Pseudotime Sexual Branch
We have merged the two datasets together in GCSKO_merge.Rmd. We also subsetted out the pre-sexual-branch and the sexual cells (male and female) and stored them in a Seurat object called tenx.mutant.integrated.sex. Here, we will perform pseudotime analysis on the dataset and build modules of genes that show similar expression across this pseudotime.
[1] "patchwork is loaded correctly"
[1] "viridis is loaded correctly"
[1] "Seurat is loaded correctly"
[1] "cowplot is loaded correctly"
[1] "gridExtra is loaded correctly"
[1] "grid is loaded correctly"
[1] "Hmisc is loaded correctly"
[1] "reshape2 is loaded correctly"
[1] "dplyr is loaded correctly"
[1] "monocle3 is loaded correctly"
Loading required package: destiny
there is no package called ‘destiny’
[1] "trying to install destiny"
Bioconductor version 3.11 (BiocManager 1.30.10), R 4.0.2 (2020-06-22)
Installing package(s) 'destiny'
also installing the dependencies ‘DEoptimR’, ‘xts’, ‘robustbase’, ‘vcd’, ‘laeken’, ‘ranger’, ‘TTR’, ‘pcaMethods’, ‘ggplot.multistats’, ‘ggthemes’, ‘VIM’, ‘knn.covertree’, ‘smoother’, ‘scatterplot3d’
cannot open URL 'https://bioconductor.org/packages/3.11/data/annotation/bin/macosx/contrib/4.0/PACKAGES.rds': HTTP status was '404 Not Found'cannot open URL 'https://bioconductor.org/packages/3.11/data/annotation/bin/macosx/contrib/4.0/PACKAGES.gz': HTTP status was '404 Not Found'cannot open URL 'https://bioconductor.org/packages/3.11/data/experiment/bin/macosx/contrib/4.0/PACKAGES.rds': HTTP status was '404 Not Found'cannot open URL 'https://bioconductor.org/packages/3.11/data/experiment/bin/macosx/contrib/4.0/PACKAGES.gz': HTTP status was '404 Not Found'cannot open URL 'https://bioconductor.org/packages/3.11/workflows/bin/macosx/contrib/4.0/PACKAGES.rds': HTTP status was '404 Not Found'cannot open URL 'https://bioconductor.org/packages/3.11/workflows/bin/macosx/contrib/4.0/PACKAGES.gz': HTTP status was '404 Not Found'Package which is only available in source form, and may need
compilation of C/C++/Fortran: ‘destiny’
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/DEoptimR_1.0-8.tgz'
Content type 'application/x-gzip' length 39620 bytes (38 KB)
==================================================
downloaded 38 KB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/xts_0.12.1.tgz'
Content type 'application/x-gzip' length 927873 bytes (906 KB)
==================================================
downloaded 906 KB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/robustbase_0.93-6.tgz'
Content type 'application/x-gzip' length 3300888 bytes (3.1 MB)
==================================================
downloaded 3.1 MB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/vcd_1.4-7.tgz'
Content type 'application/x-gzip' length 1557206 bytes (1.5 MB)
==================================================
downloaded 1.5 MB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/laeken_0.5.1.tgz'
Content type 'application/x-gzip' length 2913969 bytes (2.8 MB)
==================================================
downloaded 2.8 MB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/ranger_0.12.1.tgz'
Content type 'application/x-gzip' length 2003695 bytes (1.9 MB)
==================================================
downloaded 1.9 MB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/TTR_0.24.2.tgz'
Content type 'application/x-gzip' length 525804 bytes (513 KB)
==================================================
downloaded 513 KB
trying URL 'https://bioconductor.org/packages/3.11/bioc/bin/macosx/contrib/4.0/pcaMethods_1.80.0.tgz'
Content type 'application/x-gzip' length 1197476 bytes (1.1 MB)
==================================================
downloaded 1.1 MB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/ggplot.multistats_1.0.0.tgz'
Content type 'application/x-gzip' length 29775 bytes (29 KB)
==================================================
downloaded 29 KB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/ggthemes_4.2.0.tgz'
Content type 'application/x-gzip' length 426524 bytes (416 KB)
==================================================
downloaded 416 KB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/VIM_6.0.0.tgz'
Content type 'application/x-gzip' length 1738658 bytes (1.7 MB)
==================================================
downloaded 1.7 MB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/knn.covertree_1.0.tgz'
Content type 'application/x-gzip' length 686790 bytes (670 KB)
==================================================
downloaded 670 KB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/smoother_1.1.tgz'
Content type 'application/x-gzip' length 22584 bytes (22 KB)
==================================================
downloaded 22 KB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.0/scatterplot3d_0.3-41.tgz'
Content type 'application/x-gzip' length 333688 bytes (325 KB)
==================================================
downloaded 325 KB
The downloaded binary packages are in
/var/folders/wj/rztzclxn1t10cl2sk0plbf3r0000gn/T//RtmpXGhNV1/downloaded_packages
installing the source package ‘destiny’
trying URL 'https://bioconductor.org/packages/3.11/bioc/src/contrib/destiny_3.2.0.tar.gz'
Content type 'application/x-gzip' length 8979926 bytes (8.6 MB)
==================================================
downloaded 8.6 MB
* installing *source* package ‘destiny’ ...
** using staged installation
** libs
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/grDevices/include' -I/usr/local/include -ggdb -fPIC -Wall -g -O2 -c RcppExports.cpp -o RcppExports.o
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:535:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:2:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/LU:47:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:3:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Cholesky:12:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Jacobi:29:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:3:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Cholesky:43:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/QR:17:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Householder:27:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:5:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/SVD:48:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:6:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Geometry:58:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Eigenvalues:58:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:26:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/SparseCore:66:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:27:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/OrderingMethods:71:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:29:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/SparseCholesky:43:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:32:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/SparseQR:35:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:33:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/IterativeLinearSolvers:46:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:32:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/CholmodSupport:45:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:35:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/KroneckerProduct:34:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/../../Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:39:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/Polynomials:135:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/../../Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from RcppExports.cpp:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:40:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/SparseExtra:51:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/../../Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
17 warnings generated.
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/grDevices/include' -I/usr/local/include -ggdb -fPIC -Wall -g -O2 -c censoring.cpp -o censoring.o
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:535:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:2:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/LU:47:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:3:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Cholesky:12:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Jacobi:29:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:3:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Cholesky:43:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:4:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/QR:17:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Householder:27:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:5:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/SVD:48:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:6:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Geometry:58:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:30:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Eigenvalues:58:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:26:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/SparseCore:66:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:27:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/OrderingMethods:71:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:29:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/SparseCholesky:43:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:32:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/SparseQR:35:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:31:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Sparse:33:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/IterativeLinearSolvers:46:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:32:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/CholmodSupport:45:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:35:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/KroneckerProduct:34:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/../../Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:39:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/Polynomials:135:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/../../Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
In file included from censoring.cpp:7:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigen.h:25:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/RcppEigenForward.h:40:
In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/SparseExtra:51:
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported/Eigen/../../Eigen/src/Core/util/ReenableStupidWarnings.h:10:30: warning: pragma diagnostic pop could not pop, no matching push [-Wunknown-pragmas]
#pragma clang diagnostic pop
^
censoring.cpp:60:15: warning: variable 'm0' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
censoring.cpp:71:20: note: uninitialized use occurs here
* ( std::erfc((m0-v) / sigma) - std::erfc((m1-v) / sigma) )
^~
censoring.cpp:60:11: note: remove the 'if' if its condition is always true
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
censoring.cpp:60:15: warning: variable 'm0' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~
censoring.cpp:71:20: note: uninitialized use occurs here
* ( std::erfc((m0-v) / sigma) - std::erfc((m1-v) / sigma) )
^~
censoring.cpp:60:15: note: remove the '&&' if its condition is always true
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~~~~
censoring.cpp:55:13: note: initialize the variable 'm0' to silence this warning
double m0, m1;
^
= 0.0
censoring.cpp:60:15: warning: variable 'm1' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
censoring.cpp:71:48: note: uninitialized use occurs here
* ( std::erfc((m0-v) / sigma) - std::erfc((m1-v) / sigma) )
^~
censoring.cpp:60:11: note: remove the 'if' if its condition is always true
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
censoring.cpp:60:15: warning: variable 'm1' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~
censoring.cpp:71:48: note: uninitialized use occurs here
* ( std::erfc((m0-v) / sigma) - std::erfc((m1-v) / sigma) )
^~
censoring.cpp:60:15: note: remove the '&&' if its condition is always true
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~~~~
censoring.cpp:55:17: note: initialize the variable 'm1' to silence this warning
double m0, m1;
^
= 0.0
censoring.cpp:60:15: warning: variable 'use_d' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
censoring.cpp:66:21: note: uninitialized use occurs here
const double v = use_d ? d : c;
^~~~~
censoring.cpp:60:11: note: remove the 'if' if its condition is always true
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
censoring.cpp:60:15: warning: variable 'use_d' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~
censoring.cpp:66:21: note: uninitialized use occurs here
const double v = use_d ? d : c;
^~~~~
censoring.cpp:60:15: note: remove the '&&' if its condition is always true
} else if (!one_uncertain && one_missing) {
^~~~~~~~~~~~~~~~~
censoring.cpp:18:12: note: initialize the variable 'use_d' to silence this warning
bool use_d;
^
= false
23 warnings generated.
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include' -I'/Library/Frameworks/R.framework/Versions/4.0/Resources/library/grDevices/include' -I/usr/local/include -ggdb -fPIC -Wall -g -O2 -c utils.cpp -o utils.o
clang++ -mmacosx-version-min=10.13 -std=gnu++11 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o destiny.so RcppExports.o censoring.o utils.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
installing to /Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-destiny/00new/destiny/libs
** R
** data
** demo
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (destiny)
The downloaded source packages are in
‘/private/var/folders/wj/rztzclxn1t10cl2sk0plbf3r0000gn/T/RtmpXGhNV1/downloaded_packages’
Old packages: 'ape', 'backports', 'callr', 'car', 'ComplexHeatmap',
'conquer', 'cowplot', 'data.table', 'deldir', 'devtools', 'dplyr',
'DT', 'fields', 'fs', 'glue', 'Hmisc', 'jsonlite', 'lmtest',
'maptools', 'MASS', 'mgcv', 'nlme', 'pbapply', 'processx', 'ps',
'quantreg', 'RcppArmadillo', 'RcppHNSW', 'remotes', 'renv', 'Seurat',
'sf', 'shape', 'stringi', 'sys', 'tidyr', 'vctrs', 'xfun', 'zip'
Update all/some/none? [a/s/n]:
Loading required package: destiny
Attaching package: ‘destiny’
The following object is masked from ‘package:SummarizedExperiment’:
distance
The following object is masked from ‘package:GenomicRanges’:
distance
The following object is masked from ‘package:IRanges’:
distance
[1] "destiny installed and loaded"
## load sex only branch cells saved from GCSKO_Sex_Branch_Analysis.Rmd
## Restore the objects
## load sex branch dataset
tenx.mutant.integrated.sex <- readRDS("../data_to_export/tenx.mutant.integrated.sex.RDS")
## load full dataset
#tenx.mutant.integrated <- readRDS("../data_to_export/tenx.mutant.integrated.RDS")
## add old pt values
## these values are calculated in hte GCSKO_pseudotime_allcells.Rmd script and so were added after the object was split into sex only.
tenx.all <- readRDS("../data_to_export/tenx.mutant.integrated.RDS")
tenx.all.meta <- as.data.frame(tenx.all@meta.data)
tenx.all.meta <- tenx.all.meta[which(rownames(tenx.all.meta) %in% rownames(tenx.mutant.integrated.sex@meta.data)),]
tenx.mutant.integrated.sex <- AddMetaData(tenx.mutant.integrated.sex, tenx.all.meta$old_pt_values, col.name = "old_pt_values")
## then remove these objects so they don't use up memory
rm(c(tenx.all, tenx.all.meta))
Error in rm(c(tenx.all, tenx.all.meta)) :
... must contain names or character strings
## create a list of our mutant gene IDs
list_of_mutant_genes <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-1435200", "PBANKA-1418100", "PBANKA-1144800", "PBANKA-0902300", "PBANKA-0413400", "PBANKA-1454800")
We will now re-calculate the UMAP, PCA and diffusion map to visualise the data since the old visualisation used the variation in the whole dataset and so some of the variation in this set of cells was obscured.
ref: https://github.com/satijalab/seurat/issues/1883
The PCA is used as the basis of other dimensionality reductions so we will now recalculate this to get to our final UMAP.
First, run PCA again
tenx.mutant.integrated.sex <- RunPCA(tenx.mutant.integrated.sex, npcs = 30, verbose = FALSE)
Then inspect the PCs
ElbowPlot(tenx.mutant.integrated.sex, ndims = 30, reduction = "pca")
Have a quick look at the output
FeaturePlot(tenx.mutant.integrated.sex, reduction = "pca", pt.size = 0.01, features = "old_pt_values")
calculate UMAP
check markers to orientate
plots <- FeaturePlot(tenx.mutant.integrated.sex, features = c("PBANKA-1319500", "PBANKA-0416100"), blend = TRUE, combine = FALSE, coord.fixed = TRUE, reduction = "umapoptimised_post_repca")
plots[[3]] + NoLegend() # Get just the co-expression plot, built-in legend is meaningless for this plot
plots[[4]] # Get just the key
CombinePlots(plots[3:4], legend = 'none', ncol =2, nrow = 1, rel_widths = c(2, 1), rel_heights = c(4,1)) # Stitch the co-expression and key plots together
CombinePlots is being deprecated. Plots should now be combined using the patchwork system.
inspect mutants in 13 that are in between sexes
mutant_sex_cells <- rownames(tenx.mutant.integrated.sex[which(tenx.mutant.integrated.sex$genotype_combined == "Mutant"),])
Keys should be one or more alphanumeric characters followed by an underscore, setting key from umapoptimised_post_repca_ to umapoptimisedpostrepca_
Ultimately, we are looking for coexpression after the expression of AP2G:
## plot
marker_gene_plot_AP2G <- FeaturePlot(tenx.mutant.integrated.sex, features = "PBANKA-1437500", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("AP2G Expression")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
Scale for 'colour' is already present. Adding another scale for
'colour', which will replace the existing scale.
##view
marker_gene_plot_AP2G
and we have the following clusters currently:
## Plot
umap_cluster <- DimPlot(tenx.mutant.integrated.sex, label = TRUE, label.size = 8, repel = FALSE, pt.size = 0.5, dims = c(2,1), reduction = "DIM_UMAP") +
coord_fixed() +
theme(legend.position="bottom",
axis.line=element_blank(),
axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank()) +
guides(colour=guide_legend(nrow = 3, byrow = TRUE, override.aes = list(size=5)))
## print
umap_cluster
13 is the common sex branch so we will interrogate that branch
## subset cluster 13 cells
cells_thirteen <- rownames(tenx.mutant.integrated.sex@meta.data[tenx.mutant.integrated.sex@meta.data$seurat_clusters == 13, ])
## subset cluster 13
seurat_thirteen <- subset(tenx.mutant.integrated.sex, cells = cells_thirteen)
Keys should be one or more alphanumeric characters followed by an underscore, setting key from umapoptimised_post_repca_ to umapoptimisedpostrepca_
## re-PCA
seurat_thirteen <- RunPCA(seurat_thirteen, npcs = 30, verbose = FALSE)
ElbowPlot(seurat_thirteen, ndims = 30, reduction = "pca")
## recluster
seurat_thirteen <- FindNeighbors(seurat_thirteen, dims = 1:15)
Computing nearest neighbor graph
Computing SNN
seurat_thirteen <- FindClusters(seurat_thirteen, resolution = 1, random.seed = 42, algorithm = 2)
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 222
Number of edges: 10912
Running Louvain algorithm with multilevel refinement...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.3347
Number of communities: 3
Elapsed time: 0 seconds
## plot cluster resolution = 1
## plot
DimPlot(seurat_thirteen, reduction = "DIM_UMAP", dims = c(2,1), label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "integrated_snn_res.1") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
2 and 1 are superimposed on the branch, lets find markers for each cluster
markers_subset <- seurat_thirteen_markers %>% group_by(cluster) %>% top_n(n = 50, wt = avg_logFC) %>% filter(p_val_adj < 0.05)
markers_subset
## find markers
markers_subset_mutant <- markers_subset[which(markers_subset$gene %in% list_of_mutant_genes), ]
markers_subset_mutant
## plot
VlnPlot(seurat_thirteen, features = c(markers_subset_mutant$gene, "PBANKA-1437500"), assay = "RNA")
Specifically look for markers between 1 and 2
Just check that mutants aren’t interfering with the analysis i.e. the clusters don’t just belong to a certain genotype or mutant cell
table(seurat_thirteen$seurat_clusters, seurat_thirteen$identity_combined)
GCSKO-28 GCSKO-3 WT WT_10X
0 0 3 1 72
1 2 4 1 69
2 0 6 0 64
Do the same analysis on cluster 3 quickly
## find markers
cells_three_markers <- FindAllMarkers(cells_three, only.pos = FALSE, min.pct = 0.25, logfc.threshold = 0.25)
Calculating cluster 0
| | 0 % ~calculating
|++ | 2 % ~00s
|+++ | 5 % ~00s
|++++ | 7 % ~00s
|+++++ | 10% ~00s
|++++++ | 12% ~00s
|++++++++ | 14% ~00s
|+++++++++ | 17% ~00s
|++++++++++ | 19% ~00s
|+++++++++++ | 21% ~00s
|++++++++++++ | 24% ~00s
|++++++++++++++ | 26% ~00s
|+++++++++++++++ | 29% ~00s
|++++++++++++++++ | 31% ~00s
|+++++++++++++++++ | 33% ~00s
|++++++++++++++++++ | 36% ~00s
|++++++++++++++++++++ | 38% ~00s
|+++++++++++++++++++++ | 40% ~00s
|++++++++++++++++++++++ | 43% ~00s
|+++++++++++++++++++++++ | 45% ~00s
|++++++++++++++++++++++++ | 48% ~00s
|+++++++++++++++++++++++++ | 50% ~00s
|+++++++++++++++++++++++++++ | 52% ~00s
|++++++++++++++++++++++++++++ | 55% ~00s
|+++++++++++++++++++++++++++++ | 57% ~00s
|++++++++++++++++++++++++++++++ | 60% ~00s
|+++++++++++++++++++++++++++++++ | 62% ~00s
|+++++++++++++++++++++++++++++++++ | 64% ~00s
|++++++++++++++++++++++++++++++++++ | 67% ~00s
|+++++++++++++++++++++++++++++++++++ | 69% ~00s
|++++++++++++++++++++++++++++++++++++ | 71% ~00s
|+++++++++++++++++++++++++++++++++++++ | 74% ~00s
|+++++++++++++++++++++++++++++++++++++++ | 76% ~00s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~00s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~00s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~00s
|+++++++++++++++++++++++++++++++++++++++++++ | 86% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++ | 88% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++ | 90% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=00s
Calculating cluster 1
| | 0 % ~calculating
|+ | 2 % ~00s
|++ | 3 % ~00s
|+++ | 5 % ~00s
|++++ | 6 % ~00s
|+++++ | 8 % ~00s
|+++++ | 10% ~00s
|++++++ | 11% ~00s
|+++++++ | 13% ~00s
|++++++++ | 15% ~00s
|+++++++++ | 16% ~01s
|+++++++++ | 18% ~01s
|++++++++++ | 19% ~00s
|+++++++++++ | 21% ~01s
|++++++++++++ | 23% ~00s
|+++++++++++++ | 24% ~01s
|+++++++++++++ | 26% ~00s
|++++++++++++++ | 27% ~00s
|+++++++++++++++ | 29% ~00s
|++++++++++++++++ | 31% ~00s
|+++++++++++++++++ | 32% ~00s
|+++++++++++++++++ | 34% ~00s
|++++++++++++++++++ | 35% ~00s
|+++++++++++++++++++ | 37% ~00s
|++++++++++++++++++++ | 39% ~00s
|+++++++++++++++++++++ | 40% ~00s
|+++++++++++++++++++++ | 42% ~00s
|++++++++++++++++++++++ | 44% ~00s
|+++++++++++++++++++++++ | 45% ~00s
|++++++++++++++++++++++++ | 47% ~00s
|+++++++++++++++++++++++++ | 48% ~00s
|+++++++++++++++++++++++++ | 50% ~00s
|++++++++++++++++++++++++++ | 52% ~00s
|+++++++++++++++++++++++++++ | 53% ~00s
|++++++++++++++++++++++++++++ | 55% ~00s
|+++++++++++++++++++++++++++++ | 56% ~00s
|++++++++++++++++++++++++++++++ | 58% ~00s
|++++++++++++++++++++++++++++++ | 60% ~00s
|+++++++++++++++++++++++++++++++ | 61% ~00s
|++++++++++++++++++++++++++++++++ | 63% ~00s
|+++++++++++++++++++++++++++++++++ | 65% ~00s
|++++++++++++++++++++++++++++++++++ | 66% ~00s
|++++++++++++++++++++++++++++++++++ | 68% ~00s
|+++++++++++++++++++++++++++++++++++ | 69% ~00s
|++++++++++++++++++++++++++++++++++++ | 71% ~00s
|+++++++++++++++++++++++++++++++++++++ | 73% ~00s
|++++++++++++++++++++++++++++++++++++++ | 74% ~00s
|++++++++++++++++++++++++++++++++++++++ | 76% ~00s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~00s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~00s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~00s
|++++++++++++++++++++++++++++++++++++++++++ | 82% ~00s
|++++++++++++++++++++++++++++++++++++++++++ | 84% ~00s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~00s
|++++++++++++++++++++++++++++++++++++++++++++ | 87% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++ | 90% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 94% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Calculating cluster 2
| | 0 % ~calculating
|+ | 1 % ~01s
|++ | 3 % ~01s
|++ | 4 % ~01s
|+++ | 5 % ~01s
|++++ | 6 % ~01s
|++++ | 8 % ~01s
|+++++ | 9 % ~01s
|++++++ | 10% ~01s
|++++++ | 11% ~01s
|+++++++ | 13% ~01s
|+++++++ | 14% ~01s
|++++++++ | 15% ~01s
|+++++++++ | 16% ~01s
|+++++++++ | 18% ~01s
|++++++++++ | 19% ~01s
|+++++++++++ | 20% ~01s
|+++++++++++ | 22% ~01s
|++++++++++++ | 23% ~01s
|+++++++++++++ | 24% ~01s
|+++++++++++++ | 25% ~01s
|++++++++++++++ | 27% ~01s
|++++++++++++++ | 28% ~01s
|+++++++++++++++ | 29% ~01s
|++++++++++++++++ | 30% ~01s
|++++++++++++++++ | 32% ~01s
|+++++++++++++++++ | 33% ~01s
|++++++++++++++++++ | 34% ~01s
|++++++++++++++++++ | 35% ~01s
|+++++++++++++++++++ | 37% ~01s
|+++++++++++++++++++ | 38% ~01s
|++++++++++++++++++++ | 39% ~01s
|+++++++++++++++++++++ | 41% ~01s
|+++++++++++++++++++++ | 42% ~01s
|++++++++++++++++++++++ | 43% ~01s
|+++++++++++++++++++++++ | 44% ~01s
|+++++++++++++++++++++++ | 46% ~01s
|++++++++++++++++++++++++ | 47% ~01s
|+++++++++++++++++++++++++ | 48% ~01s
|+++++++++++++++++++++++++ | 49% ~01s
|++++++++++++++++++++++++++ | 51% ~01s
|++++++++++++++++++++++++++ | 52% ~01s
|+++++++++++++++++++++++++++ | 53% ~01s
|++++++++++++++++++++++++++++ | 54% ~01s
|++++++++++++++++++++++++++++ | 56% ~01s
|+++++++++++++++++++++++++++++ | 57% ~01s
|++++++++++++++++++++++++++++++ | 58% ~01s
|++++++++++++++++++++++++++++++ | 59% ~01s
|+++++++++++++++++++++++++++++++ | 61% ~01s
|++++++++++++++++++++++++++++++++ | 62% ~01s
|++++++++++++++++++++++++++++++++ | 63% ~00s
|+++++++++++++++++++++++++++++++++ | 65% ~00s
|+++++++++++++++++++++++++++++++++ | 66% ~00s
|++++++++++++++++++++++++++++++++++ | 67% ~00s
|+++++++++++++++++++++++++++++++++++ | 68% ~00s
|+++++++++++++++++++++++++++++++++++ | 70% ~00s
|++++++++++++++++++++++++++++++++++++ | 71% ~00s
|+++++++++++++++++++++++++++++++++++++ | 72% ~00s
|+++++++++++++++++++++++++++++++++++++ | 73% ~00s
|++++++++++++++++++++++++++++++++++++++ | 75% ~00s
|++++++++++++++++++++++++++++++++++++++ | 76% ~00s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~00s
|++++++++++++++++++++++++++++++++++++++++ | 78% ~00s
|++++++++++++++++++++++++++++++++++++++++ | 80% ~00s
|+++++++++++++++++++++++++++++++++++++++++ | 81% ~00s
|++++++++++++++++++++++++++++++++++++++++++ | 82% ~00s
|++++++++++++++++++++++++++++++++++++++++++ | 84% ~00s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~00s
|++++++++++++++++++++++++++++++++++++++++++++ | 86% ~00s
|++++++++++++++++++++++++++++++++++++++++++++ | 87% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++ | 90% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 94% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 96% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Calculating cluster 3
| | 0 % ~calculating
|+ | 1 % ~01s
|++ | 2 % ~01s
|++ | 3 % ~01s
|+++ | 4 % ~01s
|+++ | 5 % ~01s
|++++ | 7 % ~01s
|++++ | 8 % ~01s
|+++++ | 9 % ~01s
|+++++ | 10% ~01s
|++++++ | 11% ~01s
|++++++ | 12% ~01s
|+++++++ | 13% ~01s
|++++++++ | 14% ~01s
|++++++++ | 15% ~01s
|+++++++++ | 16% ~01s
|+++++++++ | 17% ~01s
|++++++++++ | 18% ~01s
|++++++++++ | 20% ~01s
|+++++++++++ | 21% ~01s
|+++++++++++ | 22% ~01s
|++++++++++++ | 23% ~01s
|++++++++++++ | 24% ~01s
|+++++++++++++ | 25% ~01s
|++++++++++++++ | 26% ~01s
|++++++++++++++ | 27% ~01s
|+++++++++++++++ | 28% ~01s
|+++++++++++++++ | 29% ~01s
|++++++++++++++++ | 30% ~01s
|++++++++++++++++ | 32% ~01s
|+++++++++++++++++ | 33% ~01s
|+++++++++++++++++ | 34% ~01s
|++++++++++++++++++ | 35% ~01s
|++++++++++++++++++ | 36% ~01s
|+++++++++++++++++++ | 37% ~01s
|++++++++++++++++++++ | 38% ~01s
|++++++++++++++++++++ | 39% ~01s
|+++++++++++++++++++++ | 40% ~01s
|+++++++++++++++++++++ | 41% ~01s
|++++++++++++++++++++++ | 42% ~01s
|++++++++++++++++++++++ | 43% ~01s
|+++++++++++++++++++++++ | 45% ~01s
|+++++++++++++++++++++++ | 46% ~01s
|++++++++++++++++++++++++ | 47% ~01s
|++++++++++++++++++++++++ | 48% ~01s
|+++++++++++++++++++++++++ | 49% ~01s
|+++++++++++++++++++++++++ | 50% ~01s
|++++++++++++++++++++++++++ | 51% ~01s
|+++++++++++++++++++++++++++ | 52% ~01s
|+++++++++++++++++++++++++++ | 53% ~01s
|++++++++++++++++++++++++++++ | 54% ~01s
|++++++++++++++++++++++++++++ | 55% ~01s
|+++++++++++++++++++++++++++++ | 57% ~01s
|+++++++++++++++++++++++++++++ | 58% ~01s
|++++++++++++++++++++++++++++++ | 59% ~01s
|++++++++++++++++++++++++++++++ | 60% ~01s
|+++++++++++++++++++++++++++++++ | 61% ~01s
|+++++++++++++++++++++++++++++++ | 62% ~01s
|++++++++++++++++++++++++++++++++ | 63% ~01s
|+++++++++++++++++++++++++++++++++ | 64% ~01s
|+++++++++++++++++++++++++++++++++ | 65% ~01s
|++++++++++++++++++++++++++++++++++ | 66% ~01s
|++++++++++++++++++++++++++++++++++ | 67% ~01s
|+++++++++++++++++++++++++++++++++++ | 68% ~01s
|+++++++++++++++++++++++++++++++++++ | 70% ~01s
|++++++++++++++++++++++++++++++++++++ | 71% ~01s
|++++++++++++++++++++++++++++++++++++ | 72% ~01s
|+++++++++++++++++++++++++++++++++++++ | 73% ~00s
|+++++++++++++++++++++++++++++++++++++ | 74% ~00s
|++++++++++++++++++++++++++++++++++++++ | 75% ~00s
|+++++++++++++++++++++++++++++++++++++++ | 76% ~00s
|+++++++++++++++++++++++++++++++++++++++ | 77% ~00s
|++++++++++++++++++++++++++++++++++++++++ | 78% ~00s
|++++++++++++++++++++++++++++++++++++++++ | 79% ~00s
|+++++++++++++++++++++++++++++++++++++++++ | 80% ~00s
|+++++++++++++++++++++++++++++++++++++++++ | 82% ~00s
|++++++++++++++++++++++++++++++++++++++++++ | 83% ~00s
|++++++++++++++++++++++++++++++++++++++++++ | 84% ~00s
|+++++++++++++++++++++++++++++++++++++++++++ | 85% ~00s
|+++++++++++++++++++++++++++++++++++++++++++ | 86% ~00s
|++++++++++++++++++++++++++++++++++++++++++++ | 87% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++ | 88% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++ | 89% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++ | 90% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++ | 91% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 92% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++ | 93% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 95% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++ | 96% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 97% ~00s
|+++++++++++++++++++++++++++++++++++++++++++++++++ | 98% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 99% ~00s
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=02s
## inspect result
markers_subset <- cells_three_markers %>% group_by(cluster) %>% top_n(n = 50, wt = avg_logFC) %>% filter(p_val_adj < 0.05)
markers_subset
## find markers
markers_subset_mutant <- markers_subset[which(markers_subset$gene %in% list_of_mutant_genes), ]
markers_subset_mutant
## plot
VlnPlot(cells_three, features = c(markers_subset_mutant$gene, "PBANKA-1437500"), assay = "RNA")
df_plot <- data.frame(t(data.frame(cells_three@assays$RNA[c("PBANKA-1447900", "PBANKA-1437500"), ])))
ggplot(df_plot, aes(PBANKA.1447900, PBANKA.1437500)) + geom_point()
## extracts only 10x cells and also remove cluster 0 cells
wt_cells <- rownames(tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$identity_combined == "WT_10X" & !tenx.mutant.integrated.sex@meta.data$seurat_clusters == "0"),])
## make a new Seurat of this
seurat.object <-subset(tenx.mutant.integrated.sex, cells = wt_cells)
Keys should be one or more alphanumeric characters followed by an underscore, setting key from umapoptimised_post_repca_ to umapoptimisedpostrepca_
## check that this is the same as the pb_sex_filtered object
#data_test <- as(as.matrix(GetAssayData(pb_sex_filtered, assay = "RNA", slot = "data")), 'sparseMatrix')
#is.equal
#is.identical
## extract counts and pheno:
## the reason we use the integrated and then subsetted is because these cells have been normalised whereas the cells in pb_sex_filtered have not been normalised (well they have but with doublets in them)
data <- as(as.matrix(GetAssayData(seurat.object, assay = "RNA", slot = "data")), 'sparseMatrix')
## make phenodata
pd <- data.frame(seurat.object@meta.data)
## keep only the columns that are relevant in metadata
#pData <- pd %>% select(orig.ident, nCount_RNA, nFeature_RNA)
## make gene metadata
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))
## Construct monocle cds
monocle.object <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)
## preprocess
monocle.object = preprocess_cds(monocle.object, num_dim = 50, norm_method = "none")
### if using integrated data:
# norm_method = "none", alignment_group = "~ experiment"
## plot jack straw plot
#plot_pc_variance_explained(monocle.object)
#monocle.object = reduce_dimension(monocle.object, reduction_method = "UMAP", preprocess_method = "PCA", umap.metric = "euclidean", umap.n_neighbors = 50, umap.min_dist = 0.5, verbose = FALSE)
#plot_cells(monocle.object, color_cells_by="experiment")
## graph learning
## add UMAP from Seurat
monocle.object@int_colData@listData$reducedDims@listData[["UMAP"]] <- seurat.object@reductions[["DIM_UMAP"]]@cell.embeddings
## cluster
## this is essential to run the learn_graph function later on
monocle.object = cluster_cells(monocle.object)
## plot initial clustering by monocle
#plot_cells(monocle.object, color_cells_by="cluster", group_cells_by="partition", x = 2, y = 1)
## map pseudotime
monocle.object = learn_graph(monocle.object, learn_graph_control=list(ncenter=500), use_partition = FALSE)
|
| | 0%
|
|==============================================================| 100%
# learn_graph_control=list(ncenter=500) - play with this parameter
## Plot cells
plot_cells(monocle.object, color_cells_by="partition", group_cells_by="partition", x = 2, y = 1)
Cells aren't colored in a way that allows them to be grouped.
save
ggsave("../images_to_export/GCSKO_sexbranch_umap_pt.png", plot = umap_pt, device = "png", path = NULL, scale = 1, width = 10, height = 10, units = "cm", dpi = 300, limitsize = TRUE)
Just check if the ‘hook’ on the males is a lineage or dead/dying/activated gams, or mutants
FeaturePlot(tenx.mutant.integrated.sex, features = "PBANKA-0416100", coord.fixed = TRUE, min.cutoff = "q1", dims = c(2,1), reduction = "DIM_UMAP", pt.size = 1, order = TRUE) +
theme_void() +
labs(title = paste("MG1 (Male)")) +
theme(plot.title = element_text(hjust = 0.5, family="Arial", size = 20, face = "bold")) +
scale_colour_gradientn(colours=c("#DCDCDC", plasma(30)))
Scale for 'colour' is already present. Adding another scale for
'colour', which will replace the existing scale.
It appears MG1 marker expression is decreasing/off in these cells - inspect mutant status
plot <- DimPlot(tenx.mutant.integrated.sex, label = FALSE, label.size = 8, repel = FALSE, pt.size = 0.05, dims = c(2,1), reduction = "DIM_UMAP", group.by = "identity_updated") +
coord_fixed() +
theme(legend.position="bottom",
axis.line=element_blank(),
axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.ticks=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank()) +
guides(colour=guide_legend(nrow = 3, byrow = TRUE, override.aes = list(size=5)))
HoverLocator(plot = plot, information = FetchData(tenx.mutant.integrated.sex, vars = c("ident", "identity_updated", "nFeature_RNA")))
the condition has length > 1 and only the first element will be usedthe condition has length > 1 and only the first element will be used`error_y.color` does not currently support multiple values.`error_x.color` does not currently support multiple values.`line.color` does not currently support multiple values.The titlefont attribute is deprecated. Use title = list(font = ...) instead.the condition has length > 1 and only the first element will be usedthe condition has length > 1 and only the first element will be used`error_y.color` does not currently support multiple values.`error_x.color` does not currently support multiple values.`line.color` does not currently support multiple values.The titlefont attribute is deprecated. Use title = list(font = ...) instead.
when pseudotime was calcualted on the whole object
library(ggpubr)
## extract pseudotime values:
pt_values_new <- as.data.frame(pseudotime(monocle.object, reduction_method = "UMAP"))
pt_values_new$cell_name <- rownames(pt_values_new)
meta_data_df <- as.data.frame(monocle.object@colData)
meta_data_df$cell_name <- rownames(meta_data_df)
meta_data_df <- merge(meta_data_df, pt_values_new, by = "cell_name")
names(meta_data_df)[ncol(meta_data_df)]<- "pt"
ggplot(meta_data_df, aes(x = old_pt_values, y = pt, colour = cluster_colours_figure)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
theme_classic() + stat_cor(method = "pearson")
This shows very good correlation, and the females have a slightly different correlation but we can see that the branches fit well on the UMAP.
## access the closest principal graph node vertex for each cell and assign it as a column in your colData table using
colData(monocle.object)$closest_vertex <- monocle.object@principal_graph_aux[["UMAP"]]$pr_graph_cell_proj_closest_vertex[,1]
## plot
plot_cells(monocle.object, color_cells_by = "closest_vertex", label_cell_groups = FALSE)
gene_module_df_sex <- find_gene_modules(monocle.object[pr_deg_ids,], resolution=c(10^seq(-6,2)), random_seed = 1234)
Found more than one class "dist" in cache; using the first, from namespace 'spam'
Also defined by ‘BiocGenerics’
Found more than one class "dist" in cache; using the first, from namespace 'spam'
Also defined by ‘BiocGenerics’
## how many genes in modules?
dim(gene_module_df_sex)
[1] 3260 5
[1] "there are 16 modules"
Make a dataframe to plot by aggregating clusters vs. modules
## make cell group df
cell_group_df <- tibble::tibble(cell=row.names(colData(monocle.object)), cell_group=colData(monocle.object)$seurat_clusters)
## make plotting df
agg_mat <- aggregate_gene_expression(monocle.object, gene_module_df_sex, cell_group_df)
Find out how many genes there are per total so we can add this to the plot
## how many genes per module?
genes_per_module <- as.data.frame(table(gene_module_df_sex$module))
genes_per_module
Find out which modules our mutant genes reside in
## create a list of our mutant gene IDs
list_of_mutant_genes <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-1435200", "PBANKA-1418100", "PBANKA-1144800", "PBANKA-0902300", "PBANKA-0413400", "PBANKA-1454800")
## make a dataframe to convert the gene IDs into actual IDs
df_mutant_ids <- as.data.frame(unique(cbind(tenx.mutant.integrated@meta.data$identity_gene_updated, tenx.mutant.integrated@meta.data$identity_updated)))[-c(1, 3),]
# remove the "820" bit on 10
df_mutant_ids$V1 <- gsub("_820", "", df_mutant_ids$V1)
# change the underscore (_) to a dash (-) in gene IDs
df_mutant_ids$V1 <- gsub("_", "-", df_mutant_ids$V1)
names(df_mutant_ids) <- c("gene_ID", "mutant_identity")
## subset modules df to include only mutant gene IDs
df_mutant_gene_modules <- as.data.frame(gene_module_df_sex[which(gene_module_df_sex$id %in% list_of_mutant_genes),])
names(df_mutant_gene_modules)[1] <- "gene_ID"
## merge dataframes
df_mutant_gene_modules <- merge(df_mutant_gene_modules, df_mutant_ids, by = "gene_ID")
## Inspect
df_mutant_gene_modules
so modules of interest are:
table(df_mutant_gene_modules$module)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
4 0 0 0 0 0 0 0 0 2 2 1 0 0 1 0 0 0
Which modules do other genes of interest lie in?:
## landmark genes (genes of interest)
# AP2G - PBANKA-1437500
# AP2 - PBANKA-0909600 - from poran paper
# AP2G-2 - PBANKA-1034300
# ccp2 - "PBANKA-1319500" - female 820
# p25 - "PBANKA-0515000" - female
# p28 - "PBANKA-0514900" - female
# ccp3 - "PBANKA-1035200" - female - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5909122/
# nek4 - "PBANKA-0616700" - female
# ap2-fg - "PBANKA-1415700" - female
# dozi - "PBANKA-1217700" - female
# MG1 - "PBANKA-0416100" - male 820
# hap2 - "PBANKA-1212600" - male
# MAPK2 - "PBANKA-0933700" - male
# nek1 - "PBANKA-1443000" - male
# cdpk4 - "PBANKA-0615200" - male
## create a list of landmark genes
list_of_landmark_genes <- c("PBANKA-1437500",
"PBANKA-0909600",
"PBANKA-1034300",
"PBANKA-1319500",
"PBANKA-0515000",
"PBANKA-0514900",
"PBANKA-1035200",
"PBANKA-0616700",
"PBANKA-1415700",
"PBANKA-1217700",
"PBANKA-0416100",
"PBANKA-1212600",
"PBANKA-0933700",
"PBANKA-1443000",
"PBANKA-0615200")
name_of_landmark_genes <- c("AP2-G",
"AP2_poran",
"AP2-G2",
"ccp2",
"p25",
"p28",
"ccp3",
"nek4",
"ap2-fg",
"DOZI",
"mg1",
"hap2",
"mapk2",
"nek1",
"cdpk4")
## make a df
name_of_landmark_genes <- data.frame("gene_name" = name_of_landmark_genes, "id" = list_of_landmark_genes)
## make dataframe
df_landmark_gene_modules <- gene_module_df_sex[which(gene_module_df_sex$id %in% list_of_landmark_genes),]
## merge dataframes
df_landmark_gene_modules <- merge(df_landmark_gene_modules, name_of_landmark_genes, by = "id")
## inspect
df_landmark_gene_modules
enrichment of screen hits in modules
DOZI-regulated genes
Find out how many of the genes in each module has a DOZI-regulated gene
DOZI_regulated_genes <-
read.csv(file = "../data/Reference/DOZI_regulated_genes.csv", header = TRUE)
## extract gene IDs
dozi_genes <- DOZI_regulated_genes[DOZI_regulated_genes$DOZI_regulated. == "YES", ]$Gene_ID_PB
## change the underscore to a dash
dozi_genes <- gsub("_", "-", dozi_genes)
## find out which modules they are in
df_dozi_gene_modules <- gene_module_df_sex[which(gene_module_df_sex$id %in% dozi_genes),]
print("dozi genes by module")
[1] "dozi genes by module"
table(df_dozi_gene_modules$module)
8 7 6 3 12 15 4 16 1 2 18 13 17 5 9 14 10 11
4 7 10 10 6 5 11 6 10 122 23 10 8 2 2 0 9 4
plot out modules
UMAP_modules <- plot_cells(monocle.object, genes=gene_module_df_sex %>% filter(module %in% c(1:20)),
cell_size = 2,
x = 2, y = 1,
label_cell_groups=FALSE,
scale_to_range = TRUE,
show_trajectory_graph=FALSE) +
scale_colour_viridis_c(name = "expression", option = "C", alpha = 1) +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom", strip.text.x = element_text(size = 15))
Scale for 'colour' is already present. Adding another scale for
'colour', which will replace the existing scale.
## view
UMAP_modules
save
ggsave("../images_to_export/GCSKO_sexbranch_umap_modules.png", plot = UMAP_modules, device = "png", path = NULL, scale = 1, width = 30, height = 30, units = "cm", dpi = 300, limitsize = TRUE)
## make a dataframe of genes per module
genes_per_module <- as.data.frame(table(gene_module_df_sex$module))
## inspect
genes_per_module
save
ggsave("../images_to_export/GCSKO_sexbranch_heatmap.png", plot = heatmap_main, device = "png", path = NULL, scale = 1, width = 27, height = 20, units = "cm", dpi = 300, limitsize = TRUE)
## landmark genes (genes of interest)
# AP2G - PBANKA-1437500
# AP2 - PBANKA-0909600 - from poran paper
# AP2G-2 - PBANKA-1034300
#list_of_mutant_genes <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-1435200", "PBANKA-1418100", "PBANKA-1144800", "PBANKA-0902300", "PBANKA-0413400", "PBANKA-1454800")
## define genes to plot
list_of_genes_of_interest <- c("PBANKA-0102400", "PBANKA-0413400", "PBANKA-0716500", "PBANKA-0828000", "PBANKA-0902300", "PBANKA-1144800", "PBANKA-1302700", "PBANKA-1418100", "PBANKA-1435200", "PBANKA-1447900", "PBANKA-1454800", "PBANKA-1437500", "PBANKA-1319500", "PBANKA-0416100", "PBANKA-1034300")
## add names
names_of_genes_of_interest <- c("GCSKO-2", "GCSKO-10", "GCSKO-19", "GCSKO-3", "GCSKO-13", "GCSKO-28", "GCSKO-oom", "GCSKO-17", "GCSKO-20", "GCSKO-29", "GCSKO-21", "AP2-G", "CCP2", "MG1", "AP2-G2")
##make df for genes of interest
genes_of_interest <- data.frame(gene = list_of_genes_of_interest, group = c(1:length(list_of_genes_of_interest)), name = names_of_genes_of_interest)
## reorder
#genes_of_interest <- genes_of_interest[match(c("AP2-G", "CCP2", "GCSKO-21", "GCSKO-17", "GCSKO-2", "MG1", "GCSKO-20", "GCSKO-3", "GCSKO-oom", "GCSKO-29", "GCSKO-10", "GCSKO-28", "GCSKO-19", "GCSKO-13"), genes_of_interest$name), ]
## prepare custom dataframe for all cells by modules:
agg_mat_genes_of_interest <- aggregate_gene_expression(monocle.object, genes_of_interest[,1:2])
## reorder using new order
agg_mat_genes_of_interest <- agg_mat_genes_of_interest[,col.order]
## plot
heatmap_plot <- pheatmap::pheatmap(agg_mat_genes_of_interest,
#scale="row",
cluster_cols = FALSE,
cluster_rows = TRUE,
clustering_method="ward.D2",
show_colnames = FALSE,
labels_row = as.character(genes_of_interest[,3]),
gaps_col = c(length(asex_cells), length(c(asex_cells,bi_cells)), length(c(asex_cells,bi_cells,female_cells))),
#gaps_row = c(1, 6),
cutree_rows = 2,
## trying to fix legend issue here
#fontsize_row = 10,
#fontsize_col = 3,
#cellheight=3,
#cellwidth = 3,
legend = TRUE,
annotation_legend = FALSE,
annotation_col = anno_no_group,
annotation_colors = annotation_colours
)
heatmap_plot
look at some of the “module” genes now too
## landmark genes (genes of interest)
# AP2G - PBANKA-1437500
# AP2 - PBANKA-0909600 - from poran paper
# AP2G-2 - PBANKA-1034300
#list_of_mutant_genes <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-1435200", "PBANKA-1418100", "PBANKA-1144800", "PBANKA-0902300", "PBANKA-0413400", "PBANKA-1454800")
## define genes to plot
list_of_genes_of_interest <- c("PBANKA-1454000", "PBANKA-1354000")
## add names
names_of_genes_of_interest <- c("GyrB", "DBR1")
##make df for genes of interest
genes_of_interest <- data.frame(gene = list_of_genes_of_interest, group = c(1:length(list_of_genes_of_interest)), name = names_of_genes_of_interest)
## reorder
#genes_of_interest <- genes_of_interest[match(c("AP2-G", "CCP2", "GCSKO-21", "GCSKO-17", "GCSKO-2", "MG1", "GCSKO-20", "GCSKO-3", "GCSKO-oom", "GCSKO-29", "GCSKO-10", "GCSKO-28", "GCSKO-19", "GCSKO-13"), genes_of_interest$name), ]
## prepare custom dataframe for all cells by modules:
agg_mat_genes_of_interest <- aggregate_gene_expression(monocle.object, genes_of_interest[,1:2])
## reorder using new order
agg_mat_genes_of_interest <- agg_mat_genes_of_interest[,col.order]
## plot
heatmap_plot <- pheatmap::pheatmap(agg_mat_genes_of_interest,
#scale="row",
cluster_cols = FALSE,
cluster_rows = TRUE,
clustering_method="ward.D2",
show_colnames = FALSE,
labels_row = as.character(genes_of_interest[,3]),
gaps_col = c(length(asex_cells), length(c(asex_cells,bi_cells)), length(c(asex_cells,bi_cells,female_cells))),
#gaps_row = c(1, 6),
cutree_rows = 2,
## trying to fix legend issue here
#fontsize_row = 10,
#fontsize_col = 3,
#cellheight=3,
#cellwidth = 3,
legend = TRUE,
annotation_legend = FALSE,
annotation_col = anno_no_group,
annotation_colors = annotation_colours
)
heatmap_plot
Make annotations to add to the heatmap
## change names for row names to include "module " at the begining of them
row.names(agg_mat) <- stringr::str_c("Module ", row.names(agg_mat))
## add number of cells to the rownames for the module
for(i in seq_along(genes_per_module$Freq)){
row.names(agg_mat)[i] <- stringr::str_c(row.names(agg_mat)[i]," (n = " ,genes_per_module$Freq[i], ")")
}
## create annotation of clusters for pheatmap:
cluster_anno <- data.frame(cluster = unique(colData(monocle.object)$seurat_clusters))
row.names(cluster_anno) <- cluster_anno$cluster
## clusters were defined earlier as:
male_clusters <- c("13", "5", "11", "25", "1", "17", "10", "24", "14", "27", "19")
female_clusters <- c("16", "22", "23", "15", "21", "30", "4", "3", "18", "7", "8", "6", "20", "12", "26")
asex_clusters <- c("0", "9", "28", "2", "29")
## add identities to the column
cluster_anno$group <- NA
cluster_anno$group[which(cluster_anno$cluster %in% asex_clusters)] <- "Asexual"
cluster_anno$group[which(cluster_anno$cluster %in% male_clusters)] <- "Male"
cluster_anno$group[which(cluster_anno$cluster %in% female_clusters)] <- "Female"
cluster_anno <- cluster_anno[ , 2, drop = FALSE]
cluster_anno
## add median pseudotime per cluster
## help here:
## https://stackoverflow.com/questions/54360855/calculate-mean-for-column-grouped-by-values-of-two-other-columns
## make subsetted dataframe
df_median_pt <- meta_data_df[ ,c("pt", "seurat_clusters")]
## apply across dataframe to get median
mean.df1 <- tapply(df_median_pt$pt, list(df_median_pt$seurat_clusters), median)
mean.df2 <- as.data.frame(as.table(mean.df1))
names(mean.df2) <- c("seurat_clusters", "pt_Median")
rownames(mean.df2) <- mean.df2$seurat_clusters
## to make each value have the mean in the OG dataframe
#merge(df_median_pt, mean.df2)
## add to annotation dataframe
cluster_anno <- merge(cluster_anno, mean.df2, by=0)
## add rownames to dataframe
rownames(cluster_anno) <- cluster_anno$Row.names
## subset to have only info of interest
cluster_anno <- cluster_anno[,c(2,4)]
names(cluster_anno) <- c("Identity", "Median_Pseudotime_of_Cluster")
## make annotation colours
annotation_colours <- list(Identity = c(Male="#016c00", Female="#a52b1e", Asexual= "#0052c5"),
Median_Pseudotime_of_Cluster = magma(12, direction = 1))
## reorder the levels
## make df of data
agg_mat_df <- as.data.frame(agg_mat)
## remove levels in my_levels that are not present here - i.e. clusters that are missing because they are not represented in the 10X data
my_levels_10x_data <- my_levels_sex[which(my_levels_sex %in% colnames(agg_mat_df))]
## sort the values
agg_mat_df <- agg_mat_df[ ,(match(my_levels_10x_data, colnames(agg_mat_df)))]
## order
#cluster_anno <- cluster_anno[(match(my_levels_10x_data, rownames(cluster_anno))), ]
## reorder columns
## first, order the annotation
cluster_anno <- cluster_anno[with(cluster_anno, order(Identity, Median_Pseudotime_of_Cluster)),]
## remove the NAs from this
cluster_anno <- cluster_anno[complete.cases(cluster_anno),]
agg_mat_df <- agg_mat_df[ ,match(rownames(cluster_anno), colnames(agg_mat_df))]
## plot heatmap
pheatmap::pheatmap(agg_mat_df,
scale="row",
cluster_cols = FALSE,
clustering_method="ward.D2",
annotation_col = cluster_anno,
annotation_colors = annotation_colours,
cutree_rows = 12)
#, gaps_col = c(28,29,37)
#row.names(agg_mat) <- factor(row.names(agg_mat), levels = row.names(agg_mat)[module_dendro$order])
## plot heatmap
#pheatmap::pheatmap(agg_mat_df,
# scale="column",
# cluster_cols = TRUE,
# cluster_rows = module_dendro,
# #clustering_method="ward.D2",
# cutree_rows = 5,
# annotation_col = cluster_anno,
# annotation_colors = annotation_colours) +
# theme(legend.position = "bottom")
All Cells
Side plots
construct new dataframes for the cells from mutants for each sex
## The original object contains all cells, we just want wild-type so let's subset out gene_module_df and cell_group_df accordingly
## male
## subset out only male and pre determination cells
male_cells <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$at_sex == "male"), ]
## take forward only smart-seq2
male_cells <- male_cells[which(male_cells$experiment == "mutants"), ]
## get cell names
male_cells <- rownames(male_cells)
## subset Seurat object to contain cells of interest
male.seurat.object <- SubsetData(tenx.mutant.integrated.sex, cells = male_cells)
## make new counts and pheno:
data <- as(as.matrix(GetAssayData(male.seurat.object, assay = "RNA", slot = "data")), 'sparseMatrix')
pd <- data.frame(male.seurat.object@meta.data)
## keep only the columns that are relevant
#pData <- pd %>% select(orig.ident, nCount_RNA, nFeature_RNA)
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))
## Construct monocle cds
male.monocle.object <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)
## preprocess
male.monocle.object = preprocess_cds(male.monocle.object, num_dim = 50, norm_method = "none")
## make a new dataframe for cell groups - it is crucial to refactor otherwise aggregate_gene_expression thinks it's out of bounds
#male_cell_group_df <- data.frame(cell=as.character(factor(male_cell_group_df$cell_id)), cell_group=factor(male_cell_group_df$pt_bin))
## aggregate expression
male_agg_mat <- aggregate_gene_expression(male.monocle.object, gene_module_df_sex, exclude.na = FALSE)
## female
## subset out only male and pre determination cells
female_cells <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$at_sex == "female"), ]
## take forward only wild-type
female_cells <- female_cells[which(female_cells$experiment == "mutants"), ]
## get cell names
female_cells <- rownames(female_cells)
## subset our cell group df to keep only these cells
#female_cell_group_df <- female_cell_group_df[which(female_cell_group_df$cell_id %in% female_cells),]
## subset Seurat object to contain cells of interest
female.seurat.object <- SubsetData(tenx.mutant.integrated.sex, cells = female_cells)
## make new counts and pheno:
data <- as(as.matrix(GetAssayData(female.seurat.object, assay = "RNA", slot = "data")), 'sparseMatrix')
pd <- data.frame(female.seurat.object@meta.data)
## keep only the columns that are relevant
#pData <- pd %>% select(orig.ident, nCount_RNA, nFeature_RNA)
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))
## Construct monocle cds
female.monocle.object <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)
## preprocess
female.monocle.object = preprocess_cds(female.monocle.object, num_dim = 50, norm_method = "none")
## make a new dataframe for cell groups - it is crucial to refactor otherwise aggregate_gene_expression thinks it's out of bounds
#female_cell_group_df <- data.frame(cell=as.character(factor(female_cell_group_df$cell_id)), cell_group=factor(female_cell_group_df$pt_bin))
## aggregate expression
female_agg_mat <- aggregate_gene_expression(female.monocle.object, gene_module_df_sex, exclude.na = FALSE)
male
# male_agg_mat
## reorder using new order
male_agg_mat <- male_agg_mat[row.order, ]
## make an anotation
anno_male <- data.frame(male.monocle.object@colData$at_sex, male.monocle.object@colData$old_pt_values, genotype = male.monocle.object@colData$identity_updated, row.names = rownames(male.monocle.object@colData))
names(anno_male) <- c("sex", "Pseudotime", "genotype")
## make annotation colours
annotation_colours <- list(sex = c(male="#016c00", female="#a52b1e", 'pre-det' = "#0052c5"),
Pseudotime = magma(12, direction = 1))
## change the order of the data frame
col.order.male <- rownames(anno_male[with(anno_male, order(genotype, Pseudotime)), ])
male_agg_mat <- male_agg_mat[,col.order.male]
## plot
heatmap_male <- pheatmap::pheatmap(male_agg_mat,
#scale="row",
cluster_cols = FALSE,
cluster_rows = FALSE,
clustering_method="ward.D2",
show_colnames = FALSE,
legend = FALSE,
annotation_legend = TRUE,
annotation_col = anno_male,
annotation_colors = annotation_colours,
cutree_rows = 12)
heatmap_male
# female_agg_mat
## reorder using new order
female_agg_mat <- female_agg_mat[row.order, ]
## make an anotation
anno_female <- data.frame(female.monocle.object@colData$at_sex, female.monocle.object@colData$old_pt_values, genotype = female.monocle.object@colData$identity_updated, row.names = rownames(female.monocle.object@colData))
names(anno_female) <- c("sex", "Pseudotime", "genotype")
## make annotation colours
annotation_colours <- list(sex = c(male="#016c00", female="#a52b1e", 'pre-det' = "#0052c5"),
Pseudotime = magma(12, direction = 1))
## change the order of the data frame
col.order.female <- rownames(anno_female[with(anno_female, order(genotype, Pseudotime)), ])
female_agg_mat <- female_agg_mat[,col.order.female]
## plot
heatmap_female <- pheatmap::pheatmap(female_agg_mat,
#scale="row",
cluster_cols = FALSE,
cluster_rows = FALSE,
clustering_method="ward.D2",
show_colnames = FALSE,
legend = FALSE,
annotation_legend = FALSE,
annotation_col = anno_female,
annotation_colors = annotation_colours,
cutree_rows = 12)
heatmap_female
## save a pheatmap: https://stackoverflow.com/questions/43051525/how-to-draw-pheatmap-plot-to-screen-and-also-save-to-file
side plots with groups of mutant cells
female
## make a new grouping for cells based on their identity
mutant_group_female <- data.frame(cell = rownames(female.monocle.object@colData), cell_group = female.monocle.object@colData$identity_updated)
## aggregate expression
female_agg_mat_grouped <- aggregate_gene_expression(female.monocle.object, gene_module_df_sex, mutant_group_female, exclude.na = FALSE)
## reorder using new order
female_agg_mat_grouped <- female_agg_mat_grouped[row.order, ]
## plot
pheatmap::pheatmap(female_agg_mat_grouped,
#scale="row",
cluster_cols = TRUE,
cluster_rows = FALSE,
clustering_method="ward.D2",
show_colnames = TRUE,
legend = FALSE,
annotation_legend = FALSE,
#annotation_col = anno_female,
#annotation_colors = annotation_colours,
cutree_rows = 12)
male
module 12 inspection
## make a df for module 12 genes
module.12.genes <- gene_module_df_sex[gene_module_df_sex$module == 12, ]$id
module.12.genes <- data.frame(id = module.12.genes, group = module.12.genes)
## prepare custom dataframe for all cells by modules:
agg_mat_module_12 <- aggregate_gene_expression(monocle.object, module.12.genes)
## make an anotation
anno_no_group <- data.frame(monocle.object@colData$sex_UMAP, monocle.object@colData$old_pt_values, row.names = rownames(monocle.object@colData))
names(anno_no_group) <- c("sex", "Pseudotime")
## make annotation colours
annotation_colours <- list(sex = c(male="#016c00", female="#a52b1e", 'pre-det' = "#0052c5"),
Pseudotime = magma(12, direction = 1))
## change the order of the data frame
col.order <- c(pre_det_ordered_cells, female_ordered_cells, male_ordered_cells)
agg_mat_module_12 <- agg_mat_module_12[,col.order]
## plot
pheatmap::pheatmap(agg_mat_module_12,
#scale="row",
cluster_cols = FALSE,
clustering_method="ward.D2",
show_colnames = FALSE,
annotation_col = anno_no_group,
annotation_colors = annotation_colours,
fontsize_row = 7,
cutree_rows = 12)
save plot
heatmap_module_12 <- pheatmap::pheatmap(agg_mat_module_12,
#scale="row",
cluster_cols = FALSE,
clustering_method="ward.D2",
show_colnames = FALSE,
annotation_col = anno_no_group,
annotation_colors = annotation_colours,
fontsize_row = 7,
cutree_rows = 12)
AP2 Expression
## reading table of AP2 genes
ap2_genes_table <- read.delim(file = "~/data/AP2_genes_table.txt", header = TRUE, sep ="\t")
## extract list of genes
ap2_genes_list <- dplyr::pull(ap2_genes_table, Gene.ID)
ap2_genes_list <- gsub("_", "-", ap2_genes_list)
## make a df for genes
ap2_genes_list <- data.frame(id = ap2_genes_list, group = ap2_genes_list)
## prepare custom dataframe for all cells by modules:
agg_mat_ap2s <- aggregate_gene_expression(monocle.object, ap2_genes_list)
## change the order of the data frame
col.order <- c(pre_det_ordered_cells, female_ordered_cells, male_ordered_cells)
agg_mat_ap2s <- agg_mat_ap2s[,col.order]
## plot
pheatmap::pheatmap(agg_mat_ap2s,
#scale="row",
cluster_cols = FALSE,
clustering_method="complete",
show_colnames = FALSE,
annotation_col = anno_no_group,
annotation_colors = annotation_colours,
fontsize_row = 7,
cutree_rows = 3)
Read in Kasia’s modules:
## read in kasia modules:
kasia_clusters <- read.csv(file = "~/data/Modules_Clusters_Phenotypes.csv", header = TRUE)
## change _ to -:
kasia_clusters$new.gene.ID <- gsub("_", "-", kasia_clusters$new.gene.ID)
## filter out genes not in modules gene_module_df_sex:
kasia_clusters_filtered <- kasia_clusters[which(kasia_clusters$new.gene.ID %in% gene_module_df_sex$id), ]
## rename new gene id
names(kasia_clusters_filtered)[2] <- "id"
## merge together
modules_merged_df <- merge(kasia_clusters_filtered, gene_module_df_sex, by = "id")
## look at the enrichment with a dotplot:
dot_plot_df_pc <- (as.data.frame.matrix(prop.table(table(modules_merged_df$Kasia.Cluster, df_meta_data$identity_combined), margin = 2)) * 100)
NOT USED complex heatmap version
## make into matrix to plot
col.order <- agg_mat_all_cells_matrix <- as.matrix(agg_mat_all_cells)
make annotation
## extract pseudotime values:
pt_values <- as.data.frame(pseudotime(monocle.object, reduction_method = "UMAP"))
names(pt_values) <- "monocle_pt_sex_wt"
tenx.mutant.integrated.sex <- AddMetaData(tenx.mutant.integrated.sex, pt_values)
## save pt values
write.csv(pt_values, file = "~/data_to_export/pt_values_sex_only.csv")
library(circlize)
## make the annotation df
## get meta data from seurat object and then subset rows out that are wt
df_anno <- tenx.mutant.integrated.sex@meta.data[which(rownames(tenx.mutant.integrated.sex@meta.data) %in% colnames(agg_mat_all_cells_matrix)), ]
## get only columns of interest:
df_anno <- df_anno[ ,c("sex", "monocle_pt_sex_wt"), drop = FALSE ]
## order annotation
df_anno <- df_anno[with(df_anno, order(sex, monocle_pt_sex_wt)),]
## order cols in the matrix
agg_mat_all_cells_matrix <- agg_mat_all_cells_matrix[ ,match(colnames(agg_mat_all_cells_matrix), rownames(df_anno))]
## make annotation
heatmap_annotation <- HeatmapAnnotation(df = cluster_anno,
col = list(sex = c(male="#016c00", female="#a52b1e", `pre-det` = "#0052c5"), monocle_pt_sex_wt = colorRamp2(c(1:70), inferno(70)))
)
heatmap_annotation <- HeatmapAnnotation(sex = df_anno$sex, pt = df_anno$monocle_pt_sex_wt)
plot
## make heatmap
modules_heatmap <- Heatmap(agg_mat_all_cells_matrix,
column_order = NULL,
cluster_columns = FALSE,
cluster_rows = FALSE,
show_column_dend = FALSE,
column_labels = rep("", ncol(agg_mat_all_cells_matrix)),
#row_order = module_dendro$order,
clustering_method_columns = "ward.D2",
bottom_annotation = heatmap_annotation,
col = colorRampPalette(rev(brewer.pal(n = 7, name =
"RdYlBu")))(100),
heatmap_legend_param = list(direction = "horizontal"))
## print
draw(modules_heatmap, merge_legend = TRUE, heatmap_legend_side = "bottom",
annotation_legend_side = "bottom")
## extract counts from 10x object
matrix_tenx_counts <- as.matrix(GetAssayData(pb_sex_filtered, assay = "RNA"))
#nk.raw.data <- as.matrix(GetAssayData(pb_sex_filtered, slot = "counts")[, WhichCells(pbmc, ident = "NK")])
## check it is the same as the merged object RNA slot
## check it is the same as the monocle object
matrix_tenx_counts_monocle <- as.matrix(as.data.frame((monocle.object@assays)))
## make heatmap
modules_heatmap <- Heatmap(matrix_tenx_counts,
column_order = NULL,
cluster_columns = TRUE,
cluster_rows = FALSE,
show_column_dend = FALSE,
column_labels = rep("", ncol(matrix_tenx_counts)),
#row_order = module_dendro$order,
clustering_method_columns = "ward.D2",
#bottom_annotation = heatmap_annotation,
col = colorRampPalette(rev(brewer.pal(n = 7, name =
"RdYlBu")))(100),
heatmap_legend_param = list(direction = "horizontal"))
## print
draw(modules_heatmap, merge_legend = TRUE, heatmap_legend_side = "bottom",
annotation_legend_side = "bottom")
Now, we will integrate the branch data we produced using slingshot and the pseudotime values to plot this heatmap.
Monocle3 has a handy function that allows us to aggregate expression of groups of cells called aggregate_gene_expression.
The code for this is located here: https://github.com/cole-trapnell-lab/monocle3/blob/1a02274209c765fe7a60f533a31b1da3dacf6785/R/cluster_genes.R
Define the groups of cells
## Split cells into groups of sexes
female_cells <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "female"), ]
male_cells <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "male"), ]
pre_det_cells <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "pre-det"), ]
## inspect range of pt values to determine bin width
hist(female_cells$PT_LineageFemale)
hist(male_cells$PT_LineageMale)
hist(pre_det_cells$PT_LineageFemale)
hist(pre_det_cells$PT_LineageMale)
Use a bin width of 2
there will be two objects for the cell_group_df: male branch and female branch. Both will include the pre-det branch
## Define male and female branch cells
# male
male_branch_meta_data <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "pre-det" | tenx.mutant.integrated.sex@meta.data$sex == "male"), ]
male_branch_meta_data <- data.frame(cell_id = rownames(male_branch_meta_data), pt = male_branch_meta_data$PT_LineageMale)
male_cell_group_df <- male_branch_meta_data
#female
female_branch_meta_data <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "pre-det" | tenx.mutant.integrated.sex@meta.data$sex == "female"), ]
female_branch_meta_data <- data.frame(cell_id = rownames(female_branch_meta_data), pt = female_branch_meta_data$PT_LineageFemale)
female_cell_group_df <- female_branch_meta_data
## what's the range of values for each pt?
range(female_cell_group_df$pt)
range(male_cell_group_df$pt)
## make bin widths
# make a new col for annotation
female_cell_group_df$pt_bin <- NA
for(i in seq(2,68,2)){
female_cell_group_df$pt_bin[which(female_cell_group_df$pt < i & female_cell_group_df$pt >= (i-2))] <- i
}
male_cell_group_df$pt_bin <- NA
for(i in seq(2,68,2)){
male_cell_group_df$pt_bin[which(male_cell_group_df$pt < i & male_cell_group_df$pt >= (i-2))] <- i
}
# then remove old pt values
male_cell_group_df <- male_cell_group_df[ ,-2]
female_cell_group_df <- female_cell_group_df[ ,-2]
## The original object contains all cells, we just want wild-type so let's subset out gene_module_df and cell_group_df accordingly
## male
## subset out only male and pre determination cells
male_cells <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "male" | tenx.mutant.integrated.sex@meta.data$sex == "pre-det"), ]
## take forward only wild-type
male_cells <- male_cells[which(male_cells$identity_combined == "WT" | male_cells$identity_combined == "WT_10X"), ]
#male_cells <- male_cells[which(male_cells$identity_combined == "WT_10X"), ]
## get cell names
male_cells <- rownames(male_cells)
## subset our cell group df to keep only these cells
male_cell_group_df <- male_cell_group_df[which(male_cell_group_df$cell_id %in% male_cells),]
## subset Seurat object to contain cells of interest
male.seurat.object <- SubsetData(tenx.mutant.integrated.sex, cells = male_cells)
## make new counts and pheno:
data <- as(as.matrix(GetAssayData(male.seurat.object, assay = "RNA", slot = "data")), 'sparseMatrix')
pd <- data.frame(male.seurat.object@meta.data)
## keep only the columns that are relevant
#pData <- pd %>% select(orig.ident, nCount_RNA, nFeature_RNA)
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))
## Construct monocle cds
male.monocle.object <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)
## preprocess
male.monocle.object = preprocess_cds(male.monocle.object, num_dim = 50, norm_method = "none")
## make a new dataframe for cell groups - it is crucial to refactor otherwise aggregate_gene_expression thinks it's out of bounds
male_cell_group_df <- data.frame(cell=as.character(factor(male_cell_group_df$cell_id)), cell_group=factor(male_cell_group_df$pt_bin))
## aggregate expression
male_agg_mat <- aggregate_gene_expression(male.monocle.object, gene_module_df_sex, male_cell_group_df, exclude.na = FALSE)
## female
## subset out only male and pre determination cells
female_cells <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "female" | tenx.mutant.integrated.sex@meta.data$sex == "pre-det"), ]
## take forward only wild-type
female_cells <- female_cells[which(female_cells$identity_combined == "WT" | female_cells$identity_combined == "WT_10X"), ]
#female_cells <- female_cells[which(female_cells$identity_combined == "WT_10X"), ]
## get cell names
female_cells <- rownames(female_cells)
## subset our cell group df to keep only these cells
female_cell_group_df <- female_cell_group_df[which(female_cell_group_df$cell_id %in% female_cells),]
## subset Seurat object to contain cells of interest
female.seurat.object <- SubsetData(tenx.mutant.integrated.sex, cells = female_cells)
## make new counts and pheno:
data <- as(as.matrix(GetAssayData(female.seurat.object, assay = "RNA", slot = "data")), 'sparseMatrix')
pd <- data.frame(female.seurat.object@meta.data)
## keep only the columns that are relevant
#pData <- pd %>% select(orig.ident, nCount_RNA, nFeature_RNA)
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))
## Construct monocle cds
female.monocle.object <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)
## preprocess
female.monocle.object = preprocess_cds(female.monocle.object, num_dim = 50, norm_method = "none")
## make a new dataframe for cell groups - it is crucial to refactor otherwise aggregate_gene_expression thinks it's out of bounds
female_cell_group_df <- data.frame(cell=as.character(factor(female_cell_group_df$cell_id)), cell_group=factor(female_cell_group_df$pt_bin))
## aggregate expression
female_agg_mat <- aggregate_gene_expression(female.monocle.object, gene_module_df_sex, female_cell_group_df, exclude.na = FALSE)
## use these clusters to reorder the modules
male_agg_mat <- male_agg_mat[match(levels(gene_module_df_sex$module), row.names(male_agg_mat)), ]
female_agg_mat <- female_agg_mat[match(levels(gene_module_df_sex$module), row.names(female_agg_mat)), ]
pheatmap::pheatmap(male_agg_mat,
scale="row",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = FALSE)
pheatmap::pheatmap(male_agg_mat,
scale="column",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = FALSE)
pheatmap::pheatmap(male_agg_mat,
scale="none",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = FALSE)
pheatmap::pheatmap(female_agg_mat,
scale="row",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = FALSE)
pheatmap::pheatmap(female_agg_mat,
scale="column",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = FALSE)
pheatmap::pheatmap(female_agg_mat,
scale="none",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = FALSE)
ComplexHeatmap version
## pheatmap calculates Z scores for plotting values
scale_matrix_by_cols <- function (x)
{
m = apply(x, 1, mean, na.rm = T)
s = apply(x, 1, sd, na.rm = T)
return((x - m)/s)
}
## calculate z score by col
female_agg_mat_scaled <- t(as.matrix(scale_matrix_by_cols(t(female_agg_mat))))
male_agg_mat_scaled <- t(as.matrix(scale_matrix_by_cols(t(male_agg_mat))))
## reorder cols
female_agg_mat_scaled <- female_agg_mat_scaled[match(levels(gene_module_df_sex$module), row.names(female_agg_mat_scaled)), ]
male_agg_mat_scaled <- male_agg_mat_scaled[match(levels(gene_module_df_sex$module), row.names(male_agg_mat_scaled)), ]
## reorder based on clusters
genes_per_module <- genes_per_module[match(levels(gene_module_df_sex$module), row.names(genes_per_module)), ]
## change names for row names to include "module " at the begining of them
row.names(female_agg_mat_scaled) <- stringr::str_c("Module ", row.names(female_agg_mat_scaled))
row.names(male_agg_mat_scaled) <- stringr::str_c("Module ", row.names(male_agg_mat_scaled))
## add number of cells to the rownames for the module
for(i in seq_along(genes_per_module$Freq)){
row.names(female_agg_mat_scaled)[i] <- stringr::str_c(row.names(female_agg_mat_scaled)[i]," (n = " ,genes_per_module$Freq[i], ")")
}
for(i in seq_along(genes_per_module$Freq)){
row.names(male_agg_mat_scaled)[i] <- stringr::str_c(row.names(male_agg_mat_scaled)[i]," (n = " ,genes_per_module$Freq[i], ")")
}
## add annotation:
#heatmap_annotation <- HeatmapAnnotation(df = cluster_anno,
# col = list(
# Identity = c(Male="#016c00", Female="#a52b1e", Asexual= "#0052c5", Committed = "#f2eb23")),
# annotation_legend_param = list(Median_Pseudotime_of_Cluster = list(direction = "horizontal"), Identity = list(nrow = 1)))
library(ComplexHeatmap)
library(RColorBrewer)
modules_heatmap_female <- Heatmap(female_agg_mat_scaled,
column_order = NULL,
#row_order = row.names(female_agg_mat_scaled)[module_dendro$order],
#clustering_method_rows = "ward.D2",
#bottom_annotation = heatmap_annotation,
col = colorRampPalette(rev(brewer.pal(n = 7, name =
"RdYlBu")))(100),
heatmap_legend_param = list(direction = "horizontal"))
modules_heatmap_male <- Heatmap(male_agg_mat_scaled,
column_order = NULL,
#row_order = module_dendro$order,
#clustering_method_rows = "ward.D2",
#bottom_annotation = heatmap_annotation,
col = colorRampPalette(rev(brewer.pal(n = 7, name =
"RdYlBu")))(100),
heatmap_legend_param = list(direction = "horizontal"))
draw(modules_heatmap_female, merge_legend = TRUE, heatmap_legend_side = "bottom",
annotation_legend_side = "bottom")
draw(modules_heatmap_male, merge_legend = TRUE, heatmap_legend_side = "bottom",
annotation_legend_side = "bottom")
## https://www.biostars.org/p/380544/
4 bin width
## Define male and female branch cells
# male
male_branch_meta_data <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "pre-det" | tenx.mutant.integrated.sex@meta.data$sex == "male"), ]
male_branch_meta_data <- data.frame(cell_id = rownames(male_branch_meta_data), pt = male_branch_meta_data$PT_LineageMale)
male_cell_group_df <- male_branch_meta_data
#female
female_branch_meta_data <- tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "pre-det" | tenx.mutant.integrated.sex@meta.data$sex == "female"), ]
female_branch_meta_data <- data.frame(cell_id = rownames(female_branch_meta_data), pt = female_branch_meta_data$PT_LineageFemale)
female_cell_group_df <- female_branch_meta_data
## what's the range of values for each pt?
range(female_cell_group_df$pt)
range(male_cell_group_df$pt)
## make bin widths
# make a new col for annotation
female_cell_group_df$pt_bin <- NA
for(i in seq(4,68,4)){
female_cell_group_df$pt_bin[which(female_cell_group_df$pt < i & female_cell_group_df$pt >= (i-4))] <- i
}
male_cell_group_df$pt_bin <- NA
for(i in seq(4,68,4)){
male_cell_group_df$pt_bin[which(male_cell_group_df$pt < i & male_cell_group_df$pt >= (i-4))] <- i
}
# then remove old pt values
male_cell_group_df <- male_cell_group_df[ ,-2]
female_cell_group_df <- female_cell_group_df[ ,-2]
## The original object contains all cells, we just want wild-type so let's subset out gene_module_df and cell_group_df accordingly
## male
## subset our cell group df to keep only these cells
male_cell_group_df <- male_cell_group_df[which(male_cell_group_df$cell_id %in% male_cells),]
## make a new dataframe for cell groups - it is crucial to refactor otherwise aggregate_gene_expression thinks it's out of bounds
male_cell_group_df <- data.frame(cell=as.character(factor(male_cell_group_df$cell_id)), cell_group=factor(male_cell_group_df$pt_bin))
## aggregate expression
male_agg_mat <- aggregate_gene_expression(male.monocle.object, gene_module_df_sex, male_cell_group_df, exclude.na = FALSE)
## female
## subset out only male and pre determination cells
## subset our cell group df to keep only these cells
female_cell_group_df <- female_cell_group_df[which(female_cell_group_df$cell_id %in% female_cells),]
## make a new dataframe for cell groups - it is crucial to refactor otherwise aggregate_gene_expression thinks it's out of bounds
female_cell_group_df <- data.frame(cell=as.character(factor(female_cell_group_df$cell_id)), cell_group=factor(female_cell_group_df$pt_bin))
## aggregate expression
female_agg_mat <- aggregate_gene_expression(female.monocle.object, gene_module_df_sex, female_cell_group_df, exclude.na = FALSE)
## use these clusters to reorder the modules
male_agg_mat <- male_agg_mat[match(levels(gene_module_df_sex$module), row.names(male_agg_mat)), ]
female_agg_mat <- female_agg_mat[match(levels(gene_module_df_sex$module), row.names(female_agg_mat)), ]
pheatmap::pheatmap(male_agg_mat,
scale="row",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = FALSE)
pheatmap::pheatmap(female_agg_mat,
scale="row",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = FALSE)
male
## make monocle object with mutants
## extract data
mutant_cells_male <- rownames(tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$genotype_combined == "Mutant" & tenx.mutant.integrated.sex@meta.data$sex == "male"),])
## make a new Seurat of this
seurat.object <-SubsetData(tenx.mutant.integrated.sex, cells = mutant_cells_male)
## make new counts and pheno:
## the reason we use the integrated and then subsetted is because these cells have been normalised whereas the cells in pb_sex_filtered have not been normalised (well they have but with doublets in them)
data <- as(as.matrix(GetAssayData(seurat.object, assay = "RNA", slot = "data")), 'sparseMatrix')
pd <- data.frame(seurat.object@meta.data)
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))
## Construct monocle cds
monocle.object.mutants.male <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)
## preprocess
monocle.object.mutants.male = preprocess_cds(monocle.object.mutants.male, num_dim = 50, norm_method = "none")
### if using integrated data:
# norm_method = "none", alignment_group = "~ experiment"
## make a cell group dataframe for aggregating expression values:
mutant_cell_group_df <- data.frame(cell = row.names(monocle.object.mutants.male@colData), cell_group = monocle.object.mutants.male@colData$identity_updated)
## aggregate expression
mutant_male_agg_mat <- aggregate_gene_expression(monocle.object.mutants.male, gene_module_df_sex, mutant_cell_group_df)
plot
mutant_male_agg_mat <- mutant_male_agg_mat[match(levels(gene_module_df_sex$module), row.names(mutant_male_agg_mat)), ]
pheatmap::pheatmap(mutant_male_agg_mat,
scale="column",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = TRUE) + theme(legend.position = "bottom")
mutant_male_agg_mat <- mutant_male_agg_mat[match(levels(gene_module_df_sex$module), row.names(mutant_male_agg_mat)), ]
pheatmap::pheatmap(mutant_male_agg_mat,
scale="column",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = TRUE) + theme(legend.position = "bottom")
female
## make monocle object with mutants
## extract data
mutant_cells_female <- rownames(tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$genotype_combined == "Mutant" & tenx.mutant.integrated.sex@meta.data$sex == "female"),])
## make a new Seurat of this
seurat.object <-SubsetData(tenx.mutant.integrated.sex, cells = mutant_cells_female)
## make new counts and pheno:
## the reason we use the integrated and then subsetted is because these cells have been normalised whereas the cells in pb_sex_filtered have not been normalised (well they have but with doublets in them)
data <- as(as.matrix(GetAssayData(seurat.object, assay = "RNA", slot = "data")), 'sparseMatrix')
pd <- data.frame(seurat.object@meta.data)
fData <- data.frame(gene_short_name = row.names(data), row.names = row.names(data))
## Construct monocle cds
monocle.object.mutants.female <- new_cell_data_set(expression_data = data, cell_metadata = pd, gene_metadata = fData)
## preprocess
monocle.object.mutants.female = preprocess_cds(monocle.object.mutants.female, num_dim = 50, norm_method = "none")
### if using integrated data:
# norm_method = "none", alignment_group = "~ experiment"
## make a cell group dataframe for aggregating expression values:
mutant_cell_group_df <- data.frame(cell = row.names(monocle.object.mutants.female@colData), cell_group = monocle.object.mutants.female@colData$identity_updated)
## aggregate expression
mutant_female_agg_mat <- aggregate_gene_expression(monocle.object.mutants.female, gene_module_df_sex, mutant_cell_group_df)
plot
mutant_female_agg_mat <- mutant_female_agg_mat[match(levels(gene_module_df_sex$module), row.names(mutant_female_agg_mat)), ]
pheatmap::pheatmap(mutant_female_agg_mat,
scale="column",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = FALSE) + theme(legend.position = "bottom")
mutant_female_agg_mat <- mutant_female_agg_mat[match(levels(gene_module_df_sex$module), row.names(mutant_female_agg_mat)), ]
pheatmap::pheatmap(mutant_female_agg_mat,
scale="row",
#clustering_method="ward.D2",
cluster_rows = FALSE,
cluster_cols = TRUE) + theme(legend.position = "bottom")
## landmark genes (genes of interest)
# AP2G - PBANKA-1437500
# AP2 - PBANKA-0909600 - from poran paper
# AP2G-2 - PBANKA-1034300
list_of_mutant_genes <- c("PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-1435200", "PBANKA-1418100", "PBANKA-1144800", "PBANKA-0902300", "PBANKA-0413400", "PBANKA-1454800")
list_of_genes_of_interest <- c("PBANKA-1437500", "PBANKA-0909600","PBANKA-1034300", "PBANKA-0828000", "PBANKA-1302700", "PBANKA-1447900", "PBANKA-0102400", "PBANKA-0716500", "PBANKA-1435200", "PBANKA-1418100", "PBANKA-1144800", "PBANKA-0902300", "PBANKA-0413400", "PBANKA-1454800")
##make df for genes of interest
genes_of_interest <- data.frame(gene = list_of_genes_of_interest, group = c(1:length(list_of_genes_of_interest)))
## aggregate expression
## make plotting df
agg_mat_genes_of_interest <- aggregate_gene_expression(monocle.object, genes_of_interest, cell_group_df)
row.names(agg_mat_genes_of_interest) <- genes_of_interest$gene
#row.names(agg_mat_genes_of_interest) <- factor(row.names(agg_mat_genes_of_interest), levels = row.names(agg_mat_genes_of_interest)[module_dendro$order])
agg_mat_genes_of_interest <- agg_mat_genes_of_interest[,match(rownames(cluster_anno), colnames(agg_mat_genes_of_interest))]
pheatmap::pheatmap(agg_mat_genes_of_interest,
scale="row",
cluster_cols = FALSE,
cluster_rows = FALSE,
clustering_method="ward.D2",
annotation_col = cluster_anno,
annotation_colors = annotation_colours)
complex heat map
## aggregate gene expression
agg_mat_genes_of_interest <- aggregate_gene_expression(monocle.object, genes_of_interest, df_all_cells)
agg_mat_genes_of_interest <- as.matrix(agg_mat_genes_of_interest)
## make heatmap
modules_heatmap <- Heatmap(agg_mat_genes_of_interest,
column_order = NULL,
cluster_columns = FALSE,
cluster_rows = FALSE,
show_column_dend = FALSE,
column_labels = rep("", ncol(agg_mat_all_cells_matrix)),
#row_order = module_dendro$order,
clustering_method_columns = "ward.D2",
bottom_annotation = heatmap_annotation,
col = colorRampPalette(rev(brewer.pal(n = 7, name =
"RdYlBu")))(100),
heatmap_legend_param = list(direction = "horizontal"))
## print
draw(modules_heatmap, merge_legend = TRUE, heatmap_legend_side = "bottom",
annotation_legend_side = "bottom")
## order cols in the matrix
agg_mat_genes_of_interest <- agg_mat_genes_of_interest[ ,match(rownames(df_anno), colnames(agg_mat_genes_of_interest))]
## make heatmap
modules_heatmap <- Heatmap(agg_mat_genes_of_interest,
column_order = NULL,
cluster_columns = TRUE,
cluster_rows = FALSE,
show_column_dend = FALSE,
column_labels = rep("", ncol(agg_mat_all_cells_matrix)),
#row_order = module_dendro$order,
clustering_method_columns = "ward.D2",
bottom_annotation = heatmap_annotation,
col = colorRampPalette(rev(brewer.pal(n = 7, name =
"RdYlBu")))(100),
heatmap_legend_param = list(direction = "horizontal"))
## print
draw(modules_heatmap, merge_legend = TRUE, heatmap_legend_side = "bottom",
annotation_legend_side = "bottom")
Using Seurat to visualise cells
# find markers for every cluster compared to all remaining cells, report only the positive ones
tenx.mutant.integrated.sex.markers <- FindAllMarkers(tenx.mutant.integrated.sex, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
tenx.mutant.integrated.sex.markers %>% group_by(cluster) %>% top_n(n = 2, wt = avg_logFC)
top10 <- tenx.mutant.integrated.sex.markers %>% group_by(cluster) %>% top_n(n = 10, wt = avg_logFC)
DoHeatmap(tenx.mutant.integrated.sex, features = top10$gene) + NoLegend()
But we also have the old pt values that we can use in the seurat object ie FeaturePlot(tenx.mutant.integrated.sex, reduction = “pca”, pt.size = 0.01, features = “old_pt_values”)
So let’s plot a heatmap where we plot: (x) all cells vs. (y) genes arranged by module that they belong to.
add an old pt annotation to the top
prepare data:
## extracts only 10x cells
wt_cells <- rownames(tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$identity_combined == "WT_10X"),])
## make a new Seurat of this
seurat.object <-SubsetData(tenx.mutant.integrated.sex, cells = wt_cells)
DoHeatmap(seurat.object, features = top10$gene) + NoLegend()
## aggregate gene expression
agg_mat_genes_of_interest <- aggregate_gene_expression(monocle.object, genes_of_interest, df_all_cells)
agg_mat_genes_of_interest <- as.matrix(agg_mat_genes_of_interest)
## make heatmap
modules_heatmap <- Heatmap(agg_mat_genes_of_interest,
column_order = NULL,
cluster_columns = FALSE,
cluster_rows = FALSE,
show_column_dend = FALSE,
column_labels = rep("", ncol(agg_mat_all_cells_matrix)),
#row_order = module_dendro$order,
clustering_method_columns = "ward.D2",
bottom_annotation = heatmap_annotation,
col = colorRampPalette(rev(brewer.pal(n = 7, name =
"RdYlBu")))(100),
heatmap_legend_param = list(direction = "horizontal"))
## print
draw(modules_heatmap, merge_legend = TRUE, heatmap_legend_side = "bottom",
annotation_legend_side = "bottom")
Expression of CCP2 and MG1 by each genotype and each sex
# ccp2 - "PBANKA-1319500" - female 820
# MG1 - "PBANKA-0416100" - male 820
## make a custom dataframe:
marker_820_df <- as.data.frame(t(as.data.frame(tenx.mutant.integrated.sex@assays$RNA@data[c("PBANKA-1319500", "PBANKA-0416100"), ], stringsAsFactors=F)))
marker_820_df$cell_id <- row.names(marker_820_df)
sex_id <- data.frame(sex_at = tenx.mutant.integrated.sex@meta.data$at_sex, genotype = tenx.mutant.integrated.sex@meta.data$identity_updated ,cell_id = row.names(tenx.mutant.integrated.sex@meta.data))
marker_820_df <- merge(marker_820_df, sex_id, by = "cell_id")
ggplot(marker_820_df, aes(fill=sex_at, y=`PBANKA-1319500`, x=genotype)) +
geom_violin() +
geom_jitter(shape=16, position=position_jitter(0.2)) +
theme_classic() +
theme(axis.text.x = element_text(angle = 90))
#tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$sex == "pre-det" | tenx.mutant.integrated.sex@meta.data$sex == "female"), ]
VlnPlot(tenx.mutant.integrated.sex, group.by = "identity_updated", split.by = "at_sex", features = c("PBANKA-1319500"), split.plot = TRUE)
VlnPlot(tenx.mutant.integrated.sex, group.by = "identity_updated", split.by = "at_sex", features = c("PBANKA-0416100"), split.plot = TRUE)
Differential expression
Re-cluster the data so we have very course grain clusters
## find new clusters
tenx.mutant.integrated.sex <- FindClusters(tenx.mutant.integrated.sex, resolution = 1, random.seed = 42, algorithm = 2)
## plot the graph
DimPlot(tenx.mutant.integrated.sex, reduction = "umapoptimised_post_repca", label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "integrated_snn_res.1") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
VlnPlot(tenx.mutant.integrated.sex, features = c("PT_Female_UMAP", "PT_Male_UMAP"))
prep for dotplot
## make a dataframe that is the meta data
df_meta_data <- as.data.frame(tenx.mutant.integrated.sex@meta.data)
## define order for plotting
my_levels_sex <- c("2", "3", "9", "6", "11", "5", "0", "10", "13", "14", "12", "8", "15", "1", "4", "7")
## redefine order of clusters:
df_meta_data$seurat_clusters <- factor(x = df_meta_data$seurat_clusters, levels = my_levels_sex)
## make a new df of CLUSTER and IDENTITY
dot_plot_df <- as.data.frame.matrix(table(df_meta_data$seurat_clusters, df_meta_data$identity_combined))
dot_plot_df$cluster <- rownames(dot_plot_df)
## calculate percentage of cells for each genotype
dot_plot_df_pc <- (as.data.frame.matrix(prop.table(table(df_meta_data$seurat_clusters, df_meta_data$identity_combined), margin = 2)) * 100)
## make a column for cluster names
dot_plot_df_pc$cluster <- rownames(dot_plot_df_pc)
## melt dataframe for plotting
library(reshape2)
dot_plot_df_pc_melted <- melt(dot_plot_df_pc, variable.name = "cluster")
colnames(dot_plot_df_pc_melted)[2] <- "identity"
## melt the raw number too
dot_plot_df_melted <- melt(dot_plot_df, variable.name = "cluster")
colnames(dot_plot_df_melted)[2] <- "identity"
colnames(dot_plot_df_melted)[3] <- "raw_number"
## merge together
identical(dot_plot_df_melted$cluster, dot_plot_df_pc_melted$cluster)
dot_plot_merged <- cbind(dot_plot_df_melted, dot_plot_df_pc_melted)
dot_plot_merged <- dot_plot_merged[,c(1,2,3,6)]
## redefine order of clusters
dot_plot_merged$cluster <- factor(x = dot_plot_merged$cluster, levels = my_levels_sex)
## where values are zero, add NA
## find wells where it's zero
zero_values <- dot_plot_merged$value == 0
dot_plot_merged$value[zero_values] <- NA
## also do for raw number
zero_values <- dot_plot_merged$raw_number == 0
dot_plot_merged$raw_number[zero_values] <- NA
## reorder x axis:
my_levels_genotype <- c("GCSKO-oom", "GCSKO-29", "GCSKO-2", "GCSKO-19", "GCSKO-3", "GCSKO-21", "GCSKO-13", "GCSKO-28", "GCSKO-10_820", "GCSKO-17", "GCSKO-20", "WT", "WT_10X")
dot_plot_merged$identity <- factor(x = dot_plot_merged$identity, levels = my_levels_genotype)
plot
## plot
dot_plot_identity <- ggplot(dot_plot_merged, aes(y = factor(cluster), x = factor(identity))) +
## make into a dot plot
geom_point(aes(colour=value, size=raw_number)) +
scale_color_gradient(low="blue", high="red", limits=c( 0, max(dot_plot_df_pc_melted$value)), na.value="white") +
#change the colours
scale_colour_viridis(option = "inferno", guide = "colourbar", na.value="white") +
theme_classic() +
theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank(), text=element_text(size=16, family="Arial")) +
ylab("Cluster") +
xlab("Identity") +
theme(axis.text.x=element_text(size=12, angle=45, hjust=1, vjust=1), axis.text.y=element_text(size=12,), legend.position = "bottom", plot.margin = unit(c(1,3,1,3), "lines")) +
## annotate males
geom_hline(aes(yintercept = 2.5)) +
## annotate females
geom_hline(aes(yintercept = 7.5)) +
## annotate pheno 1
geom_vline(aes(xintercept = 4.5)) +
## annotate pheno 2
geom_vline(aes(xintercept = 5.5)) +
## annotate pheno 3
geom_vline(aes(xintercept = 7.5)) +
## annotate pheno 4
geom_vline(aes(xintercept = 11.5))
## print
print(dot_plot_identity)
cut off and plot
pre_branch_cells <- c(2,3)
inmature_male_cells <- c()
inmature_female_cells <-
mature_male_cells <-
mature_female_cells <-
## plot the graph
DimPlot(tenx.mutant.integrated.sex, reduction = "umapoptimised_post_repca", label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "integrated_snn_res.1") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom")
wt_cells <- row.names(tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$identity_combined == "WT" | tenx.mutant.integrated.sex@meta.data$identity_combined == "WT_10X"), ])
male_inmature_cells <- row.names(tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$identity_combined == "WT" | tenx.mutant.integrated.sex@meta.data$identity_combined == "WT_10X"), ])
early_wt_cells <- row.names(tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$identity_combined == "WT" && tenx.mutant.integrated.sex@meta.data$PT_LineageFemale < 46), ])
FindMarkers(tenx.mutant.integrated.sex, cells.1 = , cells.2 = )
We must now recalculate the clusters to gain a better understanding of the heterogeneity of the subset. Since using the previous clusters, the asexual cells obscured some of the variation.
## copy old clusters
tenx.mutant.integrated.sex <- AddMetaData(tenx.mutant.integrated.sex, tenx.mutant.integrated.sex@meta.data$seurat_clusters, col.name = "pre_sex_clusters")
## generate new clusters at various resolutions
tenx.mutant.integrated.sex <- FindNeighbors(tenx.mutant.integrated.sex, dims = 1:15)
tenx.mutant.integrated.sex <- FindClusters(tenx.mutant.integrated.sex, resolution = c(2,3,4,5,6), random.seed = 42, algorithm = 2)
View the clusters at different resolutions to chose the appropraite resolution
## plot
DimPlot(tenx.mutant.integrated.sex, reduction = "umapoptimised_post_repca", label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "integrated_snn_res.2") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
DimPlot(tenx.mutant.integrated.sex, reduction = "umapoptimised_post_repca", label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "integrated_snn_res.3") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
DimPlot(tenx.mutant.integrated.sex, reduction = "umapoptimised_post_repca", label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "integrated_snn_res.4") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
DimPlot(tenx.mutant.integrated.sex, reduction = "umapoptimised_post_repca", label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "integrated_snn_res.5") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
DimPlot(tenx.mutant.integrated.sex, reduction = "umapoptimised_post_repca", label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "integrated_snn_res.6") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
Go with 4 clusters
tenx.mutant.integrated.sex <- FindClusters(tenx.mutant.integrated.sex, resolution = 4, random.seed = 42, algorithm = 2)
DimPlot(tenx.mutant.integrated.sex, reduction = "umapoptimised_post_repca", label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5) +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
You can also use the original UMAP projection
DimPlot(tenx.mutant.integrated.sex, label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5) +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
look at cluster representation
plot
## this function writes the next bit of code for you
ploty <- c()
for(i in seq_along(levels(tenx.mutant.integrated.sex@meta.data$seurat_clusters))){
ploty <- paste0(ploty, "list_UMAPs_by_cluster[[", i, "]]", " + ")
}
## plot
library(gridExtra)
grid.arrange(list_UMAPs_by_cluster[[1]] , list_UMAPs_by_cluster[[2]] , list_UMAPs_by_cluster[[3]] , list_UMAPs_by_cluster[[4]] , list_UMAPs_by_cluster[[5]] , list_UMAPs_by_cluster[[6]] , list_UMAPs_by_cluster[[7]] , list_UMAPs_by_cluster[[8]] , list_UMAPs_by_cluster[[9]] , list_UMAPs_by_cluster[[10]] , list_UMAPs_by_cluster[[11]] , list_UMAPs_by_cluster[[12]] , list_UMAPs_by_cluster[[13]] , list_UMAPs_by_cluster[[14]] , list_UMAPs_by_cluster[[15]] , list_UMAPs_by_cluster[[16]] , list_UMAPs_by_cluster[[17]] , list_UMAPs_by_cluster[[18]] , list_UMAPs_by_cluster[[19]] , list_UMAPs_by_cluster[[20]] , list_UMAPs_by_cluster[[21]] , list_UMAPs_by_cluster[[22]] , list_UMAPs_by_cluster[[23]] , list_UMAPs_by_cluster[[24]] , list_UMAPs_by_cluster[[25]] , list_UMAPs_by_cluster[[26]] , list_UMAPs_by_cluster[[27]] , list_UMAPs_by_cluster[[28]] , list_UMAPs_by_cluster[[29]] , list_UMAPs_by_cluster[[30]] , list_UMAPs_by_cluster[[31]], ncol = 5)
## for loop which takes each cluster and makes a list of cells and then plots a highlighted plot and adds it to a list
## make a df of the number of
n_per_cluster <- as.data.frame(table(tenx.mutant.integrated.sex@meta.data$seurat_clusters))
rownames(n_per_cluster) <- n_per_cluster$Var1
n_per_cluster <- n_per_cluster[,2]
## make a blank list
list_UMAPs_by_cluster <- vector(mode = "list", length = length(levels(tenx.mutant.integrated.sex@meta.data$seurat_clusters)))
## for loop
for(i in seq_along(levels(tenx.mutant.integrated.sex@meta.data$seurat_clusters))){
## make a list of cells
list_of_cells <- rownames(tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$seurat_clusters == levels(tenx.mutant.integrated.sex@meta.data$seurat_clusters)[i]), ])
umap_plot <- DimPlot(tenx.mutant.integrated.sex, reduction = "umap", label = FALSE, repel = TRUE, pt.size = 0.1, cells.highlight = list_of_cells) +
coord_fixed() +
scale_color_manual(values=c("#000000", "#f54e1e")) +
theme_void() +
labs(title = paste("cluster", levels(tenx.mutant.integrated.sex@meta.data$seurat_clusters)[i], "\n", "(n = ", n_per_cluster[i],")")) +
theme(plot.title = element_text(hjust = 0.5), legend.position = "none")
## add to the list
list_UMAPs_by_cluster[[i]] <- umap_plot
}
## plot
grid.arrange(list_UMAPs_by_cluster[[1]] , list_UMAPs_by_cluster[[2]] , list_UMAPs_by_cluster[[3]] , list_UMAPs_by_cluster[[4]] , list_UMAPs_by_cluster[[5]] , list_UMAPs_by_cluster[[6]] , list_UMAPs_by_cluster[[7]] , list_UMAPs_by_cluster[[8]] , list_UMAPs_by_cluster[[9]] , list_UMAPs_by_cluster[[10]] , list_UMAPs_by_cluster[[11]] , list_UMAPs_by_cluster[[12]] , list_UMAPs_by_cluster[[13]] , list_UMAPs_by_cluster[[14]] , list_UMAPs_by_cluster[[15]] , list_UMAPs_by_cluster[[16]] , list_UMAPs_by_cluster[[17]] , list_UMAPs_by_cluster[[18]] , list_UMAPs_by_cluster[[19]] , list_UMAPs_by_cluster[[20]] , list_UMAPs_by_cluster[[21]] , list_UMAPs_by_cluster[[22]] , list_UMAPs_by_cluster[[23]] , list_UMAPs_by_cluster[[24]] , list_UMAPs_by_cluster[[25]] , list_UMAPs_by_cluster[[26]] , list_UMAPs_by_cluster[[27]] , list_UMAPs_by_cluster[[28]] , list_UMAPs_by_cluster[[29]] , list_UMAPs_by_cluster[[30]] , list_UMAPs_by_cluster[[31]], ncol = 5)
##This is the set up for this:
## two clusters that differ
table(tenx.mutant.integrated.sex@meta.data$seurat_clusters, tenx.mutant.integrated.sex@meta.data$pre_sex_clusters)
## hemberg uses gvisSankey in https://github.com/hemberg-lab/scmap/blob/3aa2bb487a80a946469393857cea6a6effc618fb/R/Utils.R code - so maybe update with this?
## make a dataframe that is the meta data
df_meta_data <- as.data.frame(tenx.mutant.integrated.sex@meta.data)
df_alluvial <- melt(table(data.frame(full_clusters = df_meta_data$pre_sex_clusters, sex_clusters = df_meta_data$seurat_clusters)))
## load required package
#library(ggalluvial)
## plot
ggplot(df_alluvial, aes(y = value, axis1 = full_clusters, axis2 = sex_clusters)) +
geom_alluvium(aes(fill = sex_clusters),
width = 0, knot.pos = 0, reverse = FALSE) +
guides(fill = FALSE) +
geom_stratum(width = 1/8, reverse = FALSE) +
geom_text(stat = "stratum", infer.label = TRUE, reverse = FALSE) +
scale_x_continuous(breaks = 1:2, labels = c("original cluster", "Sex Cluster")) +
coord_flip() +
ggtitle("Cluster Identiity in full dataset vs. sex only") +
theme_classic()
prep for dotplot
## make a dataframe that is the meta data
df_meta_data <- as.data.frame(tenx.mutant.integrated.sex@meta.data)
## define order for plotting
my_levels_sex <- c("0", "9", "28", "2", "29", "16", "22", "23", "15", "21", "30", "4", "3", "18", "7", "8", "6", "20", "12", "26", "13", "5", "11", "25", "1", "17", "10", "24", "14", "27", "19")
## redefine order of clusters:
df_meta_data$seurat_clusters <- factor(x = df_meta_data$seurat_clusters, levels = my_levels_sex)
## make a new df of CLUSTER and IDENTITY
dot_plot_df <- as.data.frame.matrix(table(df_meta_data$seurat_clusters, df_meta_data$identity_combined))
dot_plot_df$cluster <- rownames(dot_plot_df)
## calculate percentage of cells for each genotype
dot_plot_df_pc <- (as.data.frame.matrix(prop.table(table(df_meta_data$seurat_clusters, df_meta_data$identity_combined), margin = 2)) * 100)
## make a column for cluster names
dot_plot_df_pc$cluster <- rownames(dot_plot_df_pc)
## melt dataframe for plotting
library(reshape2)
dot_plot_df_pc_melted <- melt(dot_plot_df_pc, variable.name = "cluster")
colnames(dot_plot_df_pc_melted)[2] <- "identity"
## melt the raw number too
dot_plot_df_melted <- melt(dot_plot_df, variable.name = "cluster")
colnames(dot_plot_df_melted)[2] <- "identity"
colnames(dot_plot_df_melted)[3] <- "raw_number"
## merge together
identical(dot_plot_df_melted$cluster, dot_plot_df_pc_melted$cluster)
dot_plot_merged <- cbind(dot_plot_df_melted, dot_plot_df_pc_melted)
dot_plot_merged <- dot_plot_merged[,c(1,2,3,6)]
## redefine order of clusters
dot_plot_merged$cluster <- factor(x = dot_plot_merged$cluster, levels = my_levels_sex)
## where values are zero, add NA
## find wells where it's zero
zero_values <- dot_plot_merged$value == 0
dot_plot_merged$value[zero_values] <- NA
## also do for raw number
zero_values <- dot_plot_merged$raw_number == 0
dot_plot_merged$raw_number[zero_values] <- NA
## reorder x axis:
my_levels_genotype <- c("GCSKO-oom", "GCSKO-29", "GCSKO-2", "GCSKO-19", "GCSKO-3", "GCSKO-21", "GCSKO-13", "GCSKO-28", "GCSKO-10_820", "GCSKO-17", "GCSKO-20", "WT", "WT_10X")
dot_plot_merged$identity <- factor(x = dot_plot_merged$identity, levels = my_levels_genotype)
plot
## plot
dot_plot_identity <- ggplot(dot_plot_merged, aes(y = factor(cluster), x = factor(identity))) +
## make into a dot plot
geom_point(aes(colour=value, size=raw_number)) +
scale_color_gradient(low="blue", high="red", limits=c( 0, max(dot_plot_df_pc_melted$value)), na.value="white") +
#change the colours
scale_colour_viridis(option = "inferno", guide = "colourbar", na.value="white") +
theme_classic() +
theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank(), text=element_text(size=16, family="Arial")) +
ylab("Cluster") +
xlab("Identity") +
theme(axis.text.x=element_text(size=12, angle=45, hjust=1, vjust=1), axis.text.y=element_text(size=12,), legend.position = "bottom", plot.margin = unit(c(1,3,1,3), "lines")) +
## annotate males
geom_hline(aes(yintercept = 5.5)) +
## annotate females
geom_hline(aes(yintercept = 20.5)) +
## annotate pheno 1
geom_vline(aes(xintercept = 4.5)) +
## annotate pheno 2
geom_vline(aes(xintercept = 5.5)) +
## annotate pheno 3
geom_vline(aes(xintercept = 7.5)) +
## annotate pheno 4
geom_vline(aes(xintercept = 11.5))
## print
print(dot_plot_identity)
We can then use Slingshot to plot a Pseudotime and extract mutually exclusive parts of the trajectory (Male and Female) as well as the common stalk of both trajectories.
packages
library(slingshot)
library(scater)
Data In
## extract the data from the objects and save to export to Arthur
integrated_sex_counts <- tenx.mutant.integrated.sex@assays$integrated@data
integrated_sex_pheno <- tenx.mutant.integrated.sex@meta.data
#saveRDS(integrated_sex_counts, file="~/data_to_export/integrated_sex_counts.RDS")
#saveRDS(integrated_sex_pheno, file="~/data_to_export/integrated_sex_pheno.RDS")
#sexcount<-readRDS("/Users/talman/Google\ Drive/ActiveSanger/sexpaper/integrated_sex_counts.RDS")
#sexpheno<-readRDS("/Users/talman/Google\ Drive/ActiveSanger/sexpaper/integrated_sex_pheno.RDS")
sexcount <- integrated_sex_counts
sexpheno <- integrated_sex_pheno
## technically this is a shortcut but it didn't work
##https://satijalab.org/seurat/v3.1/conversion_vignette.html
#convert Seurat to SCE object:
#pbmc.sce <- as.SingleCellExperiment(tenx.mutant.integrated.sex), assay = "integrated")
Preprocess
## make a single cell experiment object, which is the input for Slingshot
sexbranch <- SingleCellExperiment(assays = list(
counts = as.matrix(sexcount),
logcounts = as.matrix(sexcount)
), colData = sexpheno)
## subset wild-type cells
sexbranchWT<- sexbranch[, colData(sexbranch)$genotype == "WT"|colData(sexbranch)$experiment == "tenx_5k"]
## calculate the QC metrics
sexbranchWT<-calculateQCMetrics(sexbranchWT)
## set up the colour pal
nb.cols <- 18
mycolors <- colorRampPalette(brewer.pal(9, "Set1"))(nb.cols)
Calculate the PCS
## calculate PCA
pca <- prcomp(t(assays(sexbranchWT)$counts), scale. = FALSE)
## subset coordinates
rd1 <- pca$x[,1:2]
## plot
plot(rd1, col = rgb(0,0,0,.5), pch=16, asp = 2)
## cluster using kmeans
## you need to set a seed here to ensure the results are reproducible
set.seed(42)
cl2 <- kmeans(rd1, centers = 13)$cluster
## plot
plot(rd1, col = mycolors[cl2], asp = 3, pch = 16)
## make a nicer plot so we can interpret the clusters
df_plotting <- as.data.frame(cbind(rd1, cl2))
## change to character to make it discrete
df_plotting$cl2 <- as.character(df_plotting$cl2)
## plot
ggplot(df_plotting, aes(x = PC1, y = PC2, colour = cl2)) +
geom_point() +
scale_colour_manual(values = rainbow(13)) +
theme_classic()
## initialise plot to prevent error
plot.new()
## slingshot to get lineages
lin1 <- getLineages(rd1, cl2,start.clus = '8')
## make a curve through lineage
crv1 <- getCurves(lin1)
## join points with line segments and plot
plot(rd1, col = mycolors[cl2], asp = 3, pch = 16)
lines(crv1, lwd = 3, col = 'black')
## add PCA coordinates to SCE object
reducedDims(sexbranchWT) <- SimpleList(PCA = rd1)
## add clusters to SCE object
sexbranchWT$GMM<-cl2
## Add pseudotimes to SCE object
sexbranchWT$PT_LineageFemale<-as.data.frame(slingPseudotime(crv1))$curve1
sexbranchWT$PT_LineageMale<-as.data.frame(slingPseudotime(crv1))$curve2
## add designation to SCE object
sexbranchWT$male<-is.na(sexbranchWT$PT_LineageFemale)
sexbranchWT$female<-is.na(sexbranchWT$PT_LineageMale)
vec <- vector()
for (i in 1:length(sexbranchWT$male)) {
if (sexbranchWT$male[i] == sexbranchWT$female[i]) {vec<-c(vec,"pre-det")}
if (sexbranchWT$male[i] == TRUE) {vec<-c(vec,"male")}
if (sexbranchWT$female[i] == TRUE) {vec<-c(vec,"female")}
}
sexbranchWT$sex<-vec
## plot coloured by NEK3 (PBANKA_0600600)
plotPCA(sexbranchWT,shape_by="sex",colour_by="PBANKA-0600600")
## extracts only 10x cells
wt_cells <- rownames(tenx.mutant.integrated.sex@meta.data[which(tenx.mutant.integrated.sex@meta.data$identity_combined == "WT_10X"),])
## make a new Seurat of this
seurat.object <-SubsetData(tenx.mutant.integrated.sex, cells = wt_cells)
## get UMAP coordinates
umap_coords <- seurat.object@reductions$umapoptimised_post_repca@cell.embeddings
## get clusters
#clusters <- as.list(seurat.object@meta.data$integrated_snn_res.4)
#names(clusters) <- rownames(seurat.object@meta.data)
#clusters <- as.list(clusters)
## cluster using kmeans
## you need to set a seed here to ensure the results are reproducible
set.seed(42)
clusters <- kmeans(umap_coords, centers = 13)$cluster
## plot
## make a nicer plot so we can interpret the clusters
df_plotting <- as.data.frame(cbind(umap_coords, clusters))
## change to character to make it discrete
df_plotting$clusters <- as.character(df_plotting$clusters)
## plot
ggplot(df_plotting, aes(x = umapoptimised_1, y = umapoptimised_2, colour = clusters)) +
geom_point() +
scale_colour_manual(values = rainbow(15)) +
theme_classic()
## initialise plot to prevent error
plot.new()
## slingshot to get lineages
lineage_uamp <- getLineages(umap_coords, clusters, start.clus = '6', end.clus = c('1', '12'))
## make a curve through lineage
crv1 <- getCurves(lineage_uamp)
## join points with line segments and plot
plot(umap_coords, col = mycolors[clusters], asp = 3, pch = 16)
lines(crv1, lwd = 3, col = 'black')
Add data to Seurat:
## extract data to add to Seurat
## extract clusters
meta_data_to_add_from_slingshot <- data.frame(clusters_k_means_UMAP = clusters)
## Add pseudotimes
# check the length of each branch to see which curve is which using: sum(is.na(as.data.frame(slingPseudotime(crv1))$curve1))
# then inspect using the ggplot2 above to where males are -
# tail(as.data.frame(slingPseudotime(crv1)), 100)
# tail(meta_data_to_add_from_slingshot, 100)
meta_data_to_add_from_slingshot$PT_Female_UMAP <- as.data.frame(slingPseudotime(crv1))$curve1
meta_data_to_add_from_slingshot$PT_Male_UMAP <- as.data.frame(slingPseudotime(crv1))$curve2
## add designation to SCE object
meta_data_to_add_from_slingshot$sex_UMAP <- "pre-det"
meta_data_to_add_from_slingshot$sex_UMAP[which(is.na(meta_data_to_add_from_slingshot$PT_Female_UMAP))] <- "male"
meta_data_to_add_from_slingshot$sex_UMAP[which(is.na(meta_data_to_add_from_slingshot$PT_Male_UMAP))] <- "female"
## if there are 3 curves in slingPseudotime(crv1):
#meta_data_to_add_from_slingshot$sex_UMAP[which(is.na(meta_data_to_add_from_slingshot$PT_Female_UMAP) & is.na(as.data.frame(slingPseudotime(crv1))$curve3))] <- "male"
#meta_data_to_add_from_slingshot$sex_UMAP[which(is.na(meta_data_to_add_from_slingshot$PT_Male_UMAP) & is.na(as.data.frame(slingPseudotime(crv1))$curve3))] <- "female"
## add clusters to SCE object
tenx.mutant.integrated.sex <- AddMetaData(tenx.mutant.integrated.sex, meta_data_to_add_from_slingshot)
## plot
FeaturePlot(tenx.mutant.integrated.sex, label.size = 5, pt.size = 0.5, features = c("PT_Female_UMAP", "PT_Male_UMAP")) +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
plot sex designations
DimPlot(tenx.mutant.integrated.sex, label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "sex_UMAP", na.value = "white") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
Load package
library(scmap)
Set up Index
#the reference dataset is: sexbranchWT
## set up a feature symbol column in the SCE object
rowData(sexbranchWT)$feature_symbol <- rownames(sexbranchWT)
## make an is_expr assay which only counts values with more than 0
assay(sexbranchWT, "is_expr") <- counts(sexbranchWT) > 0
## Select the top 500 most important genes
sexbranchWT <- selectFeatures(sexbranchWT, suppress_plot = FALSE, n_features = 500)
## inspect features selected
table(rowData(sexbranchWT)$scmap_features)
## create reference index
sexbranchWT <- indexCell(sexbranchWT)
## inspect results
names(metadata(sexbranchWT)$scmap_cell_index)
length(metadata(sexbranchWT)$scmap_cell_index$subcentroids)
dim(metadata(sexbranchWT)$scmap_cell_index$subcentroids[[1]])
Query dataset
#query data set is called: sexbranch
z <- sexbranch
## add feature symbol to SCE object
rowData(z)$feature_symbol <- rownames(z)
## map cells
scmapCell_results <- scmapCell(z, list(yan = metadata(sexbranchWT)$scmap_cell_index))
## copy over PCA coordinate
colData(sexbranchWT)$PC1<-as.data.frame(reducedDim(sexbranchWT))$PC1
colData(sexbranchWT)$PC2<-as.data.frame(reducedDim(sexbranchWT))$PC2
## Get nearest cell - which is located in the first row and make a list
getcells <- scmapCell_results$yan$cells[1, ]
## Extract meta data for these cells
cdsce <- colData(sexbranchWT)[getcells, ]
## Get similarity scores
topsim <- scmapCell_results$yan$similarities[1, ]
## add meta data to query dataset
# add nearest cell
z$top_pbcell <- getcells
## add PCA coordinates
z$PC1 <- cdsce$PC1
z$PC2 <- cdsce$PC2
## add assigned sex
z$sex<- cdsce$sex
## add pt
z$PT_LineageFemale<- cdsce$PT_LineageFemale
z$PT_LineageMale<- cdsce$PT_LineageMale
## add similarity score
z$topsim <- topsim
## inspect similarity scores
hist(z$topsim)
## count anything with a similarity score below 0.4 as unassigned
z$topcell_sp[z$topsim < 0.4] <- "unassigned"
z$topcell_sp <- as.factor(z$topcell_sp)
z$yt<-rep("assigned",length(z$topcell_sp))
z$yt[z$topsim < 0.4] <- "unassigned"
## extract PC scores
no <- as.data.frame(reducedDim(sexbranchWT)[,1:2])
## extract meta data
number2 <- as.data.frame(cdsce)
## extract top cell
number2$topcell_sp <- z$yt
## plot
ggplot(no, aes(PC1, PC2)) +
geom_point(size = 1,alpha = 1/10) +
geom_point(aes(x=PC1, y=PC2,shape=factor(topcell_sp)), data=number2, size=2, colour="black") +
theme_classic() + scale_shape_manual(values=c(1,2))+
scale_color_manual( values = c("0" = "#1f77b4","1" ="#ff7f0e","2" ="#2ca02c","3,0" ="#d62728","3,1" ="#9467bd","3,2"="#8c564b","3,3"="#e377c2","4"="#bcbd22","5"="#17becf","6"="#aec7e8")) + labs(x="PC1", y="PC2")
#+
# theme(legend.position="none", axis.title=element_text(size=8), legend.text = element_text(size = , legend.title = element_text(size #= 8), axis.text = element_text(size=:sunglasses:, axis.text.x = element_blank(), axis.text.y = element_blank())
read in Arthur’s data
## read in data
at_data <- readRDS("~/data/sexbranch")
## extract values of interest
at_meta_data <- as.data.frame(at_data@colData)
## look at new cols:
head(table(at_meta_data$top_pbcell))
head(table(at_meta_data$topsim))
head(table(at_meta_data$topcell_sp))
head(table(at_meta_data$yt))
head(table(at_meta_data$sex))
#table(at_meta_data$PT_LineageFemale)
#table(at_meta_data$PT_LineageMale)
Add meta data to Seurat object
colnames_to_add_to_meta_data <- c("top_pbcell", "topsim", "topcell_sp", "yt", "sex", "PT_LineageFemale", "PT_LineageMale")
tenx.mutant.integrated.sex <- AddMetaData(tenx.mutant.integrated.sex, metadata = at_meta_data[,which(colnames(at_meta_data) %in% colnames_to_add_to_meta_data)], col.name = c("at_top_pbcell", "at_topsim", "at_topcell_sp", "at_yt", "at_sex", "at_PT_LineageFemale", "at_PT_LineageMale"))
## to remove columns in metadata:
#tenx.mutant.integrated.sex@meta.data[136:144] <- NULL
#tenx.mutant.integrated.sex@meta.data[['NAME_OF_COL']] <- NULL
Plot sexes
DimPlot(tenx.mutant.integrated.sex, label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "at_sex") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
## the slingshot pt values have NAs for e.g. male cells in the PT_LineageFemale, we can deal with this later, but for now, we will just ignore this as Dimplot will not plot NA values
#sum(is.na(tenx.mutant.integrated.sex@meta.data$at_PT_LineageMale))
#sum(is.na(tenx.mutant.integrated.sex@meta.data$at_PT_LineageFemale))
## plot
FeaturePlot(tenx.mutant.integrated.sex, label.size = 5, pt.size = 0.5, features = c("at_PT_LineageFemale", "at_PT_LineageMale")) +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
table(tenx.mutant.integrated.sex@meta.data$sex)
male_clusters <- c("13", "5", "11", "25", "1", "17", "10", "24", "14", "27", "19")
female_clusters <- c("16", "22", "23", "15", "21", "30", "4", "3", "18", "7", "8", "6", "20", "12", "26")
asex_clusters <- c("0", "9", "28", "2", "29")
tenx.mutant.integrated.sex@meta.data$sex_designation_using_clusters <- NA
tenx.mutant.integrated.sex@meta.data$sex_designation_using_clusters[which(tenx.mutant.integrated.sex@meta.data$seurat_clusters %in% male_clusters)] <- "Male"
tenx.mutant.integrated.sex@meta.data$sex_designation_using_clusters[which(tenx.mutant.integrated.sex@meta.data$seurat_clusters %in% female_clusters)] <- "Female"
tenx.mutant.integrated.sex@meta.data$sex_designation_using_clusters[which(tenx.mutant.integrated.sex@meta.data$seurat_clusters %in% asex_clusters)] <- "Pre-det"
table(tenx.mutant.integrated.sex@meta.data$sex_designation_using_clusters)
table(tenx.mutant.integrated.sex@meta.data$sex, tenx.mutant.integrated.sex@meta.data$sex_designation_using_clusters)
Plot sexes
DimPlot(tenx.mutant.integrated.sex, label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "sex") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
Plot sexes
DimPlot(tenx.mutant.integrated.sex, label = TRUE, repel = TRUE, label.size = 5, pt.size = 0.5, group.by = "sex_designation_using_clusters") +
coord_fixed() +
theme_void() +
theme(legend.position = "bottom") +
guides(colour=guide_legend(nrow=3,byrow=TRUE, override.aes = list(size=4)))
correlation of monocle PT and
## extract pseudotime values:
pt_values <- as.data.frame(pseudotime(monocle.object, reduction_method = "UMAP"))
pt_values$cell_name <- rownames(pt_values)
meta_data_df <- as.data.frame(monocle.object@colData)
meta_data_df$cell_name <- rownames(meta_data_df)
meta_data_df <- merge(meta_data_df, pt_values, by = "cell_name")
names(meta_data_df)[142]<- "pt"
male_pt_correlation_df <- meta_data_df[which(meta_data_df$cell_name %in% male_cells), ]
female_pt_correlation_df <- meta_data_df[which(meta_data_df$cell_name %in% female_cells), ]
ggplot(male_pt_correlation_df, aes(x = PT_LineageMale, y = pt, colour = sex)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
theme_classic()
ggplot(female_pt_correlation_df, aes(x = PT_LineageFemale, y = pt, colour = sex)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
theme_classic()
ggplot(male_pt_correlation_df, aes(x = PT_Female_UMAP, y = pt)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
theme_classic()
save environment
## This saves everything in the global environment for easy recall later
#save.image(file = "GCSKO_Sex_Branch_Analysis.RData")
#load(file = "GCSKO_Sex_Branch_Analysis.RData")
save modules
#gene_module_df_sex
write.csv(gene_module_df_sex, file = "../data_to_export/gene_module_df_sex.csv")
#save(pb_30k_sex_filtered, pb_sex_filtered, file = "Part_2_input.Rdata")
Save object(s)
## save integrated object to file
#saveRDS(tenx.mutant.integrated.sex, file = "../data_to_export/tenx.mutant.integrated.sex.processed.RDS")
## restore the object
#tenx.mutant.integrated <- readRDS("../data_to_export/tenx.mutant.integrated.sex.processed.RDS")
Seurat:::DoHeatmap
monocle:::plot_pseudotime_heatmap
getAnywhere(aggregate_gene_expression)
A single object matching ‘aggregate_gene_expression’ was found
It was found in the following places
package:monocle3
namespace:monocle3
with value
function (cds, gene_group_df = NULL, cell_group_df = NULL, norm_method = c("log",
"binary", "size_only"), pseudocount = 1, scale_agg_values = TRUE,
max_agg_value = 3, min_agg_value = -3, exclude.na = TRUE)
{
if (is.null(gene_group_df) && is.null(cell_group_df))
stop("Error: one of either gene_group_df or cell_group_df must not be NULL")
agg_mat <- normalized_counts(cds, norm_method = norm_method,
pseudocount = pseudocount)
if (is.null(gene_group_df) == FALSE) {
gene_group_df <- as.data.frame(gene_group_df)
gene_group_df <- gene_group_df[gene_group_df[, 1] %in%
fData(cds)$gene_short_name | gene_group_df[, 1] %in%
row.names(fData(cds)), , drop = FALSE]
short_name_mask <- gene_group_df[[1]] %in% fData(cds)$gene_short_name
if (any(short_name_mask)) {
geneids <- as.character(gene_group_df[[1]])
geneids[short_name_mask] <- row.names(fData(cds))[match(geneids[short_name_mask],
fData(cds)$gene_short_name)]
gene_group_df[[1]] <- geneids
}
agg_mat = as.matrix(my.aggregate.Matrix(agg_mat[gene_group_df[,
1], ], as.factor(gene_group_df[, 2]), fun = "sum"))
if (scale_agg_values) {
agg_mat <- t(scale(t(agg_mat)))
agg_mat[agg_mat < min_agg_value] <- min_agg_value
agg_mat[agg_mat > max_agg_value] <- max_agg_value
}
}
if (is.null(cell_group_df) == FALSE) {
cell_group_df = as.data.frame(cell_group_df)
cell_group_df = cell_group_df[cell_group_df[, 1] %in%
row.names(pData(cds)), , drop = FALSE]
agg_mat = agg_mat[, cell_group_df[, 1]]
agg_mat = my.aggregate.Matrix(Matrix::t(agg_mat), as.factor(cell_group_df[,
2]), fun = "mean")
agg_mat = Matrix::t(agg_mat)
}
if (exclude.na) {
agg_mat <- agg_mat[rownames(agg_mat) != "NA", colnames(agg_mat) !=
"NA", drop = FALSE]
}
return(agg_mat)
}
<bytecode: 0x7fbfccca4ff0>
<environment: namespace:monocle3>
so essentially it first takes the counts matrix and subsets it by your gene groups, then it sums the values - if you specify scale then it will calcualte the z score of the transformed dataframe
construct map
## construct diffusion map
## http://www.bioconductor.org/packages/devel/bioc/vignettes/slingshot/inst/doc/vignette.html
## input is a transformed expression matrix (genes as cols and cells as rows)
dm <- DiffusionMap(t(as.data.frame(tenx.mutant.integrated.sex@assays$integrated@data)))
## extract meta data for plotting
df_meta_data <- (as.data.frame(tenx.mutant.integrated.sex@meta.data))
## make combined dataframe
rd2 <- as.data.frame(cbind(DC1 = dm$DC1, DC2 = dm$DC2, identity = as.factor(as.character(tenx.mutant.integrated.sex@meta.data$post_integration_clusters))))
## plot
ggplot(rd2, aes(x = DC1, y = DC2, colour = as.character(identity))) + geom_point(size = 1) +
theme(axis.ticks.y = element_blank()) +
theme_classic()
## make a density plot for real time correlations using kasia data
## make an annotation dataframe
anno_real_time <- data.frame(monocle.object@colData$cluster_colours_figure, monocle.object@colData$pt, monocle.object@colData$Prediction.Spearman._Kasia, row.names = rownames(monocle.object@colData))
names(anno_real_time) <- c("sex", "Pseudotime", "real_time")
## change the order of the cols (cells) in data frame
col.order <- rownames(anno_real_time[with(anno_real_time, order(sex, Pseudotime)), ])
anno_real_time <- anno_real_time[col.order,]
## add an "order" col to the dataframe to assist in plotting
anno_real_time$order <- c(1:nrow(anno_real_time))
#### plot density plot x
dens1 <- ggplot(anno_real_time, aes(x = order, fill = as.factor(real_time))) +
geom_density(alpha = 0.2) +
#theme_void() +
#theme(legend.position = "none")
## add annotations for sex
geom_rect(data = cluster_anno, mapping=aes(xmin=x1, xmax=x2, ymin=y1, ymax=y2, fill=Median_Pseudotime_of_Cluster), inherit.aes = FALSE) +
scale_fill_viridis_c(option = "plasma") +
## flip coordinates
## geoms below will use another color scale
new_scale_fill() +
geom_rect(data = cluster_anno, mapping=aes(xmin=Bx1, xmax=Bx2, ymin=By1, ymax=By2, fill=Identity), inherit.aes = FALSE) +
scale_fill_manual(values = c("Male" ="#016c00", "Female"="#a52b1e", "Asexual"= "#0052c5", "Committed" = "#f2eb23"))
dens1